GENES FROM 20q13 AMPLICON AND THEIR USES

Description of corresponding document: WO9802539



GENES FROM 20q13 AMPLICON AND THEIR USES

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of USSN 08/785,532, filed on January 17, 1997, USSN 08/731,499
filed on October 16, 1996 and USSN 08/680,395 filed on July 15, 1996,
which is related to copending U.S. Patent Application, USSN 08/546,130, filed October 20, 1995, each of
which is incorporated herein by reference for all purposes.

BACKGROUND OF THE INVENTION 

This invention pertains to the field of cytogenetics. More particularly this invention pertains to the 
identification of genes in a region of amplification at about 20q13 in various cancers. The genes disclosed
here can be used as probes specific for the 20q13 amplicon as well as for treatment of various cancers.

Chromosome abnormalities are often associated with genetic disorders, degenerative diseases, and cancer. 
In particular, the deletion or multiplication of copies of whole chromosomes or chromosomal segments, and 
higher level amplifications of specific regions of the genome are common occurrences in cancer. See, for 
example Smith et al., Breast Cancer Res. Treat., 18: Suppl. 1: 5-14 (1991); van de Vijver & Nusse, Biochim.
Biophys. Acta, 1072: 33-50 (1991); Sato et al., Cancer Res., 50: 7184-7189 (1990). In fact, the amplification
and deletion of DNA sequences containing proto-oncogenes and tumor-suppressor genes, respectively, are
frequently characteristic of tumorigenesis. Dutrillaux et al., Cancer Genet. Cytogenet., 49: 203-217 (1990).
Clearly, the identification of amplified and
deleted regions and the cloning of the genes involved is crucial both to the study of tumorigenesis and to the 
development of cancer diagnostics. 

The detection of amplified or deleted chromosomal regions has traditionally been done by cytogenetics. 
Because of the complex packing of DNA into the chromosomes, resolution of cytogenetic techniques has 
been limited to regions larger than about 10 Mb: approximately the width of a band in Giemsa-stained 
chromosomes. In complex karyotypes with multiple translocations and other genetic changes, traditional 
cytogenetic analysis is of little utility because karyotype information is lacking or cannot be interpreted. 
Teyssier, J.R., Cancer Genet. Cytogenet., 37: 103 (1989). Furthermore, conventional cytogenetic banding 
analysis is time consuming, labor intensive, and frequently difficult or impossible. 

More recently, cloned probes have been used to assess the amount of a given DNA sequence in a 
chromosome by Southern blotting. This method is effective even if the genome is heavily rearranged so as to 
eliminate useful karyotype information. 

However, Southern blotting only gives a rough estimate of the copy number of a DNA sequence, and does 
not give any information about the localization of that sequence within the chromosome. 

Comparative genomic hybridization (CGH) is a more recent approach to identify the presence and 
localization of amplified/deleted sequences. See Kallioniemi et al., Science, 258: 818 (1992). CGH, like
Southern blotting, reveals amplifications and deletions irrespective of genome rearrangement. Additionally,
CGH provides a more quantitative estimate of copy number than Southern blotting, and moreover also
provides information on the localization of the amplified or deleted sequence in the normal chromosome.

Using CGH, the chromosomal 20q13 region has been identified as a region that is frequently amplified in
cancers (see, e.g., Kallioniemi et al., Genomics, 20: 125-128 (1994)). Initial analysis of this region in breast
cancer cell lines identified a region approximately 2 Mb on chromosome 20 that is consistently amplified. 

SUMMARY OF THE INVENTION 

The present invention relates to the identification of a narrow region (about 600 kb) within a 2 Mb amplicon 
located at about chromosome 20q13 (more precisely at 20q13.2) that is consistently amplified in primary
tumors. In addition, this invention provides cDNA sequences from a number of genes which map to this 
region. These sequences are useful as probes or as probe targets for monitoring the relative copy number of 
corresponding sequences from a biological sample such as a tumor cell. Also provided is a contig (a series 
of clones that contiguously spans this amplicon) which can be used to prepare probes specific for the 
amplicon. The probes can be used to detect chromosomal abnormalities at 20q13.

Thus, in one embodiment, this invention provides a method of detecting a chromosome abnormality (e.g., an 
amplification or a deletion) at about position FLpter 0.825 on human chromosome 20 (20q13.2). The method
involves contacting a chromosome sample from a patient with a composition consisting essentially of one or 
more labeled nucleic acid probes each of which binds selectively to a target polynucleotide sequence at 
about position FLpter 0.825 on human chromosome 20 under conditions in which the probe forms a stable 
hybridization complex with the target sequence; and detecting the hybridization complex. The step of 
detecting the hybridization complex can involve determining the copy number of the target sequence. The 
probe preferably comprises a nucleic acid that specifically hybridizes under stringent conditions to a nucleic 
acid selected from the nucleic acids disclosed here. Even more preferably, the probe comprises a 
subsequence selected from sequences set forth in SEQ. ID. Nos. 1-10 and 12. 

The probe is preferably labeled, and is more preferably labeled with digoxigenin or biotin.

In one embodiment, the hybridization complex is detected in interphase nuclei in the sample. Detection is
preferably carried out by detecting a fluorescent label (e.g., FITC, fluorescein, or Texas Red). The method
can further involve contacting the sample with a reference probe which binds selectively to a chromosome 20 
centromere. 

This invention also provides for two new genes, ZABC1 and 1b1, in the 20q13.2 region that are both amplified
and overexpressed in a variety of cancers. ZABC1 is a putative zinc finger protein. Zinc finger proteins are
found in a variety of transcription factors, and amplification or overexpression of transcription factors typically
results in cellular mis-regulation. ZABC1 and 1b1 thus appear to play an important role in the etiology of a
number of cancers. 

This invention provides for a new human cyclophilin nucleic acid (SEQ ID NO 13). Cyclophilin nucleic acids
have been implicated in a variety of cellular processes, including signal transduction.

This invention also provides for proteins encoded by nucleic acid sequences in the 20q13 amplicon (SEQ.
ID. Nos: 1-10 and 12-13) and subsequences, more preferably subsequences of at least 10 amino acids,
preferably of at least 20 amino acids and most preferably of at least 30 amino acids in length. Particularly
preferred subsequences are epitopes specific to the 20q13 proteins, more preferably epitopes specific to the
ZABC1 and 1b1 proteins. Such proteins include, but are not limited to, isolated polypeptides comprising at least
20 amino acids from a polypeptide encoded by the nucleic acids of SEQ. ID No. 1-10 and 12-13 or from the
polypeptide of SEQ. ID. No. 11 wherein the polypeptide, when presented as an immunogen, elicits the
production of an antibody which specifically binds to a polypeptide selected from the group consisting of a
polypeptide encoded by the nucleic acids of SEQ. ID No. 1-10 and 12-13 or from the polypeptide of SEQ. ID.
No. 11, where the polypeptide does not bind to antisera raised against a polypeptide selected from the group
consisting of a polypeptide encoded by the nucleic acids of SEQ. ID No. 1-10 and 12-13 or from the
polypeptide of SEQ. ID. No. 11 which has been fully immunosorbed with a polypeptide selected from the
group consisting of a polypeptide encoded by the nucleic acids of SEQ. ID No. 1-10 and 12-13 or from the
polypeptide of SEQ. ID. No. 11. In preferred embodiments, the polypeptides of the invention bind to
antisera raised against a polypeptide encoded by SEQ ID NOs. 1-13, where the antisera has been
immunosorbed with the most structurally related previously
known polypeptide. For example, a polypeptide of the invention binds to antisera raised against a
polypeptide encoded by SEQ ID NO. 13, wherein the antisera has been immunosorbed with a rat or mouse
cyclophilin polypeptide (Rat cyclophilin nucleic acids are known; see GenBank(TM) under accession No.
M19533. Mouse cyclophilin nucleic acids are known; see GenBank(TM) under accession No. 50620. The
mouse and rat cyclophilin cDNAs are about 85% identical to SEQ ID NO. 13).

In another embodiment, the method can involve detecting a polypeptide (protein) encoded by a nucleic acid 
(ORF) in the 20q13 amplicon. The method may include any of a number of well known protein detection
methods including, but not limited to, the protein assays disclosed herein. 

This invention also provides cDNA sequences from genes in the amplicon (SEQ. ID. Nos. 1-10 and 12-13).
The nucleic acid sequences can be used in therapeutic applications according to known methods for
modulating the expression of the endogenous gene or the activity of the gene product. Examples of
therapeutic approaches include antisense inhibition of gene expression, gene therapy, monoclonal 
antibodies that specifically bind the gene products, and the like. The genes can also be used for recombinant 
expression of the gene products in vitro. 

This invention also provides for proteins (e.g., SEQ. ID. No. 11) encoded by the cDNA sequences from
genes in the amplicon (e.g., SEQ. ID. Nos. 1-10 and 12-13). Where the amplified nucleic acids include cDNAs
which are expressed, detection and/or quantification of the protein expression product can be used to identify
the presence or absence or quantify the amplification level of the amplicon or of abnormal protein products
produced by the amplicon.

The probes disclosed here can be used in kits for the detection of a chromosomal abnormality at about 
position FLpter 0.825 on human chromosome 20. The kits include a compartment which contains a labeled 
nucleic acid probe which binds selectively to a target polynucleotide sequence at about FLpter 0.825 on 
human chromosome 20. The probe preferably includes at least one nucleic acid that specifically hybridizes 
under stringent conditions to a nucleic acid selected from the nucleic acids disclosed here. Even more 
preferably, the probes comprise one or more nucleic acids selected from the nucleic acids disclosed here. In 
a preferred embodiment, the probes are labeled with digoxigenin or biotin. The kit may further include a
reference probe specific to a sequence in the centromere of chromosome 20. 

Definitions 

A "nucleic acid sample" as used herein refers to a sample comprising DNA in a form suitable for 
hybridization to the probes of the invention. The nucleic acid may be total genomic DNA, total mRNA, genomic
DNA or mRNA from particular chromosomes, or selected sequences (e.g., particular promoters, genes,
amplification or restriction fragments, cDNA, etc.) within particular amplicons disclosed here. The nucleic acid
sample may be extracted from particular cells or tissues. The tissue sample from which the nucleic acid 
sample is prepared is typically taken from a patient suspected of having the disease associated with the 
amplification being detected. In some cases, the nucleic acids may be amplified using standard techniques 
such as PCR, prior to the hybridization. 

The sample may be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose) for use in
Southern or dot blot hybridizations and the like. The sample may also be prepared such that individual
nucleic acids remain substantially intact and comprises interphase nuclei prepared according to standard
techniques. A "nucleic acid sample" as used herein may also refer to a substantially intact condensed
chromosome (e.g., a metaphase chromosome). Such a condensed chromosome is suitable for use as a
hybridization target in in situ hybridization techniques (e.g., FISH). The particular usage of the term
"nucleic acid sample" (whether as
extracted nucleic acid or intact metaphase chromosome) will be readily apparent to one of skill in the art from
the context in which the term is used. For instance, the nucleic acid sample can be a tissue or cell sample
prepared for standard in situ hybridization methods described below. The sample is prepared such that
individual chromosomes remain substantially intact and typically comprises metaphase spreads or 
interphase nuclei prepared according to standard techniques. 

A "chromosome sample" as used herein refers to a tissue or cell sample prepared for standard in situ
hybridization methods described below. The sample is prepared such that individual chromosomes remain 
substantially intact and typically comprises metaphase spreads or interphase nuclei prepared according to 
standard techniques. 

"Nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded
form, and unless otherwise limited, would encompass known analogs of natural nucleotides that can function 
in a similar manner as naturally occurring nucleotides. 

An "isolated" polynucleotide is a polynucleotide which is substantially separated from other contaminants that 
naturally accompany it, e.g., protein, lipids, and other polynucleotide sequences. The term embraces 
polynucleotide sequences which have been removed or purified from their naturally-occurring environment or
clone library, and include recombinant or cloned DNA isolates and chemically synthesized analogues or
analogues biologically synthesized by heterologous systems. 

"Subsequence" refers to a sequence of nucleic acids that comprise a part of a longer sequence of nucleic 
acids. 

A "probe" or a "nucleic acid probe", as used herein, is defined to be a collection of one or more nucleic acid 
fragments whose hybridization to a target can be detected. The probe may be unlabeled or labeled as 
described below so that its binding to the target can be detected. The probe is produced from a source of 
nucleic acids from one or more particular (preselected) portions of the genome, for example one or more 
clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain 
reaction (PCR) amplification products. The probes of the present invention are produced from nucleic acids
found in the 20q13 amplicon as described herein. The probe may be processed in some manner, for
example, by blocking or removal of repetitive nucleic acids or enrichment with unique nucleic acids. Thus the
word "probe" may be used herein to refer not only to the detectable nucleic acids, but to the detectable
nucleic acids in the form in which they are applied to the target, for example, with the blocking nucleic acids,
etc. The blocking nucleic acid may also be referred to separately. What "probe" refers to specifically is clear
from the context in which the word is used.

The probe may also be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose). In some
embodiments, the probe may be a member of an array of nucleic acids as described, for instance, in
WO 95/17958. Techniques capable of producing high density arrays can also be used for this purpose (see,
e.g., Fodor et al., Science 767-773 (1991) and U.S. Patent No. 5,143,854).



"Hybridizing" refers to the binding of two single stranded nucleic acids via complementary base pairing.

"Bind(s) substantially" or "binds specifically" or "binds selectively" or "hybridizes specifically" refer to
complementary hybridization between an oligonucleotide and a target sequence and embrace minor
mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the
desired detection of the target polynucleotide sequence. These terms also refer to the binding, duplexing, or
hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that
sequence is present in a complex mixture (e.g., total cellular) DNA or RNA. The term "stringent conditions"
refers to conditions under which a probe will hybridize to its target subsequence, but to no other sequences.
Stringent conditions are sequence-dependent and will be different in different circumstances. "Stringent
hybridization" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization
experiments such as CGH, FISH, Southern and northern hybridizations are sequence dependent, and are
different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is
found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with
Nucleic Acid Probes part I chapter 2 "overview of principles of hybridization and the strategy of nucleic acid
probe assays", Elsevier, New York. Generally, highly stringent hybridization and wash conditions are selected
to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence
hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a
particular probe.

An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have
more than 100 complementary residues on a filter in a Southern or northern blot is 50% formamide with 1 mg of
heparin at 42°C, with the hybridization being carried out overnight. An example of stringent wash conditions
is a 0.2x SSC wash at 65°C for 15 minutes (see, Sambrook, supra for a description of SSC buffer). Often, the
high stringency wash is preceded by a low stringency wash to remove background probe signal. An example
medium stringency wash for a duplex of, e.g., about 100 nucleotides or more, is 1x SSC at 45°C for 15
minutes. An example low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4x SSC at 40°C
for 15 minutes. In general, a signal to noise ratio of 2x (or higher) than that observed for an unrelated probe
in the particular hybridization assay indicates detection of a specific hybridization.
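
The two numerical rules of thumb stated above (a highly stringent temperature about 5°C below the Tm, and a signal to noise ratio of at least 2x relative to an unrelated probe) can be illustrated with a minimal Python sketch. The function names, the `level` labels and the example values are hypothetical and are not part of the disclosure; the Tm itself is taken as a measured or separately estimated input, since no formula for it is given here.

```python
def stringent_temperature(tm_celsius: float, level: str = "high") -> float:
    """Return a hybridization/wash temperature for a stringency level.

    "high"      -> about 5 degrees C below the Tm (highly stringent)
    "very_high" -> equal to the Tm (very stringent)
    The level names are illustrative labels only.
    """
    offsets = {"high": 5.0, "very_high": 0.0}
    if level not in offsets:
        raise ValueError(f"unknown stringency level: {level!r}")
    return tm_celsius - offsets[level]


def is_specific_signal(probe_signal: float,
                       unrelated_probe_signal: float,
                       min_ratio: float = 2.0) -> bool:
    """True if the probe signal is at least min_ratio times the signal
    observed for an unrelated probe in the same assay."""
    if unrelated_probe_signal <= 0:
        raise ValueError("unrelated probe signal must be positive")
    return probe_signal / unrelated_probe_signal >= min_ratio


if __name__ == "__main__":
    # Hypothetical probe with a Tm of 70 degrees C.
    print(stringent_temperature(70.0))           # highly stringent
    print(stringent_temperature(70.0, "very_high"))
    print(is_specific_signal(250.0, 100.0))
```

The 2x threshold is kept as a parameter because the text describes it as a general guide ("2x (or higher)") rather than a fixed cutoff.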

One of skill will recognize that the precise sequence of the particular probes described herein can be 
modified to a certain degree to produce probes that are "substantially identical" to the disclosed probes, but 
retain the ability to bind substantially to the target sequences. Such modifications are specifically covered by 
reference to the individual probes herein. The term "substantial identity" of polynucleotide sequences means 
that a polynucleotide comprises a sequence that has at least 90% sequence identity, more preferably at
least 95%, compared to a reference sequence using the methods described below using standard
parameters. 

Two nucleic acid sequences are said to be "identical" if the sequence of nucleotides in the two sequences is 
the same when aligned for maximum correspondence as described below. The term "complementary to" is
used herein to mean that the complementary sequence is identical to all or a portion of a reference
polynucleotide sequence. Nucleic acids which do not hybridize to complementary versions of each other
under stringent conditions are still substantially identical if the polypeptides which they encode are
substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon 
degeneracy permitted by the genetic code. 

Sequence comparisons between two (or more) polynucleotides are typically performed by comparing 
sequences of the two sequences over a "comparison window" to identify and compare local regions of 
sequence similarity. A "comparison window", as used herein, refers to a segment of at least about 20 
contiguous positions, usually about 50 to about 200, more usually about 100 to about 150 in which a
sequence may be compared to a reference sequence of the same number of contiguous positions after the
two sequences are optimally aligned.

Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith
and Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman and
Wunsch, J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl.
Acad. Sci. (U.S.A.) 85: 2444 (1988), or by computerized implementations of these algorithms.

"Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a 
comparison window, wherein the portion of the polynucleotide sequence in the comparison window may 
comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not 
comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by
determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in
both sequences to yield the number of matched positions, dividing the number of matched positions by the
total number of positions in the window of comparison and multiplying the result by 100 to yield the
percentage of sequence identity. 
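
As an illustration of the calculation just described, the following Python sketch computes the percentage of sequence identity for two sequences that have already been optimally aligned. It is a minimal sketch under stated assumptions: the alignment is supplied as two equal-length strings with "-" marking a gap, the whole aligned string is treated as the comparison window, and the function names and example sequences are hypothetical.

```python
GAP = "-"  # assumed gap symbol in the pre-aligned input strings


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by total positions in the window, times 100.

    Both inputs must already be aligned (equal length, gaps as '-').
    A position counts as matched only if both sequences carry the same
    base or residue there; a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != GAP
    )
    return 100.0 * matched / len(aligned_a)


def substantially_identical(aligned_a: str, aligned_b: str,
                            threshold: float = 90.0) -> bool:
    """Apply the 'at least 90%' identity criterion (95% more preferably)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-position window: one gap and one mismatch.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```

In practice the comparison window would span at least about 20 contiguous positions, as defined above; the short strings here only keep the arithmetic easy to follow.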

Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to the 
same nucleic acid sequence under stringent conditions. 

"Conservatively modified variations" of a particular nucleic acid sequence refers to those nucleic acids which 
encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an 
amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a 
large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons
CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an
arginine is specified by a codon, the codon can be altered to any of the corresponding codons described
without altering the encoded polypeptide. Such nucleic acid variations are "silent variations," which are one
species of "conservatively modified variations." Every nucleic acid sequence herein which encodes a
polypeptide also describes every possible silent variation. One of skill will recognize that each codon in a
nucleic acid (except AUG, which is ordinarily the only codon for methionine) can be modified to yield a
functionally identical molecule by standard techniques. Accordingly, each "silent variation" of a nucleic acid
which encodes a polypeptide is implicit in each described sequence. Furthermore, one of skill will recognize
that individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small
percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence
are "conservatively modified variations" where the alterations result in the substitution of an amino acid with
a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids
are well known in the art.
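
The silent variations described above follow directly from the degeneracy of the standard genetic code. The Python sketch below builds the standard codon table and lists the codons that are interchangeable at a given position; the helper names and the example sequence are hypothetical, and the sketch assumes the standard code with RNA codons in the reading frame starting at the first base.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
# (first base varies slowest); '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_codons(codon: str) -> list[str]:
    """All codons (including the input) encoding the same amino acid."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(rna: str) -> int:
    """Number of distinct coding sequences that encode the same polypeptide
    as `rna`, counting the original sequence itself."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    total = 1
    for i in range(0, len(rna), 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total


if __name__ == "__main__":
    print(synonymous_codons("CGU"))         # the six arginine codons
    print(synonymous_codons("AUG"))         # methionine has a single codon
    print(count_silent_variants("AUGCGU"))  # Met-Arg
```

Because the count is a product over codons, the number of implicit silent variations grows very quickly with the length of the coding sequence.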

The following six groups each contain amino acids that are conservative substitutions for one another. 

1) Alanine (A), Serine (S), Threonine (T); 

2) Aspartic acid (D), Glutamic acid (E);

3) Asparagine (N), Glutamine (Q);

4) Arginine (R), Lysine (K);

5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and

6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
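
The six groups listed above can be encoded as a simple lookup, as in the following Python sketch. The function name is hypothetical, residues are given as one-letter codes, and amino acids that do not appear in any of the six groups (for example glycine) are treated here as having no conservative substitution.

```python
# The six conservative substitution groups, as one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group.

    An unchanged residue is not a substitution and returns False.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    print(is_conservative_substitution("D", "E"))  # same group
    print(is_conservative_substitution("A", "K"))  # different groups
```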

The term "20q13 amplicon protein" is used herein to refer to proteins encoded by ORFs in the 20q13
amplicon disclosed herein. Assays that detect 20q13 amplicon proteins are intended to detect the level of
endogenous (native) 20q13 amplicon proteins present in the subject biological sample. However,
exogenous 20q13 amplicon proteins (from a source extrinsic to the biological sample) may be added to
various assays to provide a label or to compete with the native 20q13 amplicon protein in binding to an
anti-20q13 amplicon protein antibody. One of skill will appreciate that a 20q13 amplicon protein mimetic may be
used in place of exogenous 20q13 protein in this context. A "20q13 protein", as used herein, refers to a
molecule that bears one or more 20q13 amplicon protein epitopes such that it is specifically bound by an
antibody that specifically binds a native 20q13 amplicon protein.

As used herein, an "antibody" refers to a protein consisting of one or more polypeptides substantially 
encoded by immunoglobulin genes or fragments of immunoglobulin genes. The recognized immunoglobulin
genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as
the myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda.
Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin
classes, IgG, IgM, IgA, IgD and IgE, respectively.

The basic immunoglobulin (antibody) structural unit is known to comprise a tetramer. Each tetramer is 
composed of two identical pairs of polypeptide chains, each pair having one "light" (about 25 kD) and one
"heavy" chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110
or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer to
these light and heavy chains respectively.

Antibodies may exist as intact immunoglobulins or as a number of well characterized fragments produced by 
digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages 
in the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a
disulfide bond. The F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the hinge
region thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially an Fab
with part of the hinge region (see, Fundamental Immunology, W.E. Paul, ed., Raven Press, N.Y. (1993) for a
more detailed description of other antibody fragments). While various antibody fragments are defined in
terms of the digestion of an intact antibody, one of skill will appreciate that such Fab' fragments may be
synthesized de novo either chemically or by utilizing recombinant DNA methodology.
Thus, the term antibody, as used herein, also includes antibody fragments either produced by the
modification of whole antibodies or synthesized de novo using recombinant DNA methodologies.

The phrase "specifically binds to a protein" or "specifically immunoreactive with", when referring to an
antibody refers to a binding reaction which is determinative of the presence of the protein in the presence of
a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay
conditions, the specified antibodies bind to a particular protein and do not bind in a significant amount to
other proteins present in the sample. Specific binding to a protein under such conditions may require an
antibody that is selected for its specificity for a particular protein. For example, antibodies can be raised to
the 20q13 amplicon protein that bind the 20q13 amplicon protein and not any other proteins present in a
biological sample. A variety of immunoassay formats may be used to select antibodies specifically
immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used
to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and
Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New
York, for a description of immunoassay formats and conditions that can be used to determine specific
immunoreactivity.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1(A) shows disease-free survival of 129 breast cancer patients according to the level of 20q13 
amplification. Patients with tumors having high level 20q13 amplification have a shorter disease-free survival
(p=0.04 by Mantel-Cox test) compared to those having no or low level amplification. 

Figure 1(B) shows the same disease-free survival difference of Figure 1(A) in the sub-group of 79 axillary
node-negative patients (p=0.0022 by Mantel-Cox test). 

Figure 2 shows a comparison of 20q13 amplification detected by FISH in a primary breast carcinoma and its 
metastasis from a 29-year-old patient. A low level amplification of 20q13 (20q13 compared to 20p reference
probe) was found in the primary tumor. The metastasis, which appeared 8 months after mastectomy, shows
a high level amplification of the chromosome 20q13 region. The overall copy ... primary tumors that further
narrow the region of highest amplification to within about 600 kb.

Figure 4 provides a higher resolution map of the amplicon core as defined in primary tumors.

Figure 5 shows the map location of 15 genes in the amplicon. 

Figure 6 shows a sequence alignment between rat cyclophilin and SEQ ID NO. 13.

Figure 7 is a physical map of the 20q13 amplicon. Row A of the figure shows the position of STSs used to 
assemble the map. Row B shows the position of YACs. Row C shows the position of P1 and BAC clones.
Row D shows the position of trapped exons, direct
selected cDNAs, ESTs, and genes. Row E shows results of hybridization analysis of various primary tumors
and cell lines using P1 or BAC clones as noted there. A solid black circle indicates high amplification, a
shaded circle indicates intermediate amplification and an open circle indicates no amplification as detected 
by each of the clones. 

DETAILED DESCRIPTION 

This invention provides a number of cDNA sequences which can be used as probes for the detection of 
chromosomal abnormalities at 20q13. Studies using comparative genomic hybridization (CGH) have shown
that a region at chromosome 20q13 is increased in copy number frequently in cancers of the breast (~30%),
ovary (~15%), bladder (~30%), head and neck (~75%) and colon (~80%). This suggests that one or
more genes that contribute to the progression of several solid tumors are located at 20q13.

Gene amplification is one mechanism by which dominantly acting oncogenes are overexpressed, allowing 
tumors to acquire novel growth characteristics and/or resistance to chemotherapeutic agents. Loci implicated
in human breast cancer progression and amplified in 10-25% of primary breast carcinomas include the erbB-2
locus (Lupu et al., Breast Cancer Res. Treat. 27: 83 (1993), Slamon et al., Science, 235: 177-182 (1987),
Heiskanen et al., Biotechniques, 17: 928 (1994)) at 17q12, cyclin-D (Mahadevan et al., Science, 255:
1253-1255 (1993), Gillett et al., Canc. Res., 54: 1812 (1994)) at 11q13 and MYC (Gaffey et al., Mod. Pathol.,
6: 654 (1993)) at 8q24.

Pangenomic surveys using comparative genomic hybridization (CGH) recently identified about 20 novel 
regions of increased copy number in breast cancer (Kallioniemi et al., Genomics, 20: 125-128 (1994)). One
of these loci, band 20q13, was amplified in 18% of primary tumors and 40% of cell lines (Kallioniemi et al.,
Genomics, 20: 125-128 (1994)). More recently, this same region was found amplified in 15% of ovarian, 80%
of bladder and 80% of colorectal tumors. The resolution of CGH is limited to 5-10 Mb. Thus, FISH was
performed using locus specific probes to confirm the CGH data and precisely map the region of amplification. 

The 20q13 region has been analyzed in breast cancer at the molecular level and a region, approximately 600 
kb wide, that is consistently amplified was identified, as described herein. Moreover, as shown herein, the 
importance of this amplification in breast cancer is indicated by the strong association between amplification 
and decreased patient survival and increased tumor proliferation (specifically, increased fraction of cells in 
S-phase). 

In particular, as explained in detail in Example 1, high-level 20q13 amplification was associated (p=0.0022) 
with poor disease free survival in node-negative patients, compared to cases with no or low-level 
amplification (Figure 1). Survival of patients with moderately amplified tumors did not differ significantly from 
those without amplification. Without being bound to a particular theory, it is suggested that an explanation for 
this observation may be that low level amplification precedes high level amplification. In this regard, it may be 
significant that one patient developed a local metastasis with high-level 20q13.2 amplification 8 months after 
resection of a primary tumor with low level amplification (Figure 2). 

The 20q13 amplification was associated with high histologic grade of the tumors. This correlation was seen 
in both moderately and highly amplified tumors. There was also a correlation (p=0.0085) between high level 
amplification of a region complementary to a particular probe, RMC20C001 (Tanner et al., Cancer Res., 54: 
4257-4260 (1994)), and cell proliferation, measured by the fraction of cells in S-phase. 

This finding is important because it identifies a phenotype that can be scored in functional assays, without 
knowing the mechanism underlying the increased S-phase fraction. The 20q13 amplification did not correlate 
with the age of the patient, primary tumor size, axillary nodal or steroid hormone-receptor status. 

This work localized the 20q13.2 amplicon to an interval of approximately 2 Mb. Furthermore, it suggests that 
high-level amplification, found in 7% of the tumors, confers an aggressive phenotype on the tumor, adversely 
affecting clinical outcome. Low level amplification (22% of primary tumors) was associated with pathological 
features typical of aggressive tumors (high histologic grade, aneuploidy and cell proliferation) but not patient 
prognosis. 

In addition, it is shown herein that the 20q13 amplicon (more precisely the 20q13.2 amplicon) is one of three 
separate co-amplified loci on human chromosome 20 that are packaged together throughout the genomes of 
some primary tumors and breast cancer cell lines. No known oncogenes map in the 20q13.2 amplicon. 

Identification of 20q13 Amplicon Probes. 

Initially, a paucity of available molecular cytogenetic probes dictated that FISH probes be generated by the 
random selection of cosmids from a chromosome 20 specific library, LA20NC01, which were then mapped to 
chromosome 20 by digital imaging microscopy. Approximately 46 cosmids, spanning the 70 Mb chromosome, 
were isolated for which fractional length measurements (FLpter) and band assignments were obtained. 

Twenty six of the cosmids were used to assay copy number in the breast cancer cell line BT474 by interphase 
FISH analysis. Copy number was determined by counting hybridization signals in interphase nuclei. This 
analysis revealed that cosmid RMC20C001 (FLpter, 0.824; 20q13.2), described by Stokke et al., Genomics, 
26: 134-137 (1995), defined the highest-level amplification (approximately 60 copies/cell) in BT474 cells 
(Tanner et al., Cancer Res., 54: 4257-4260 (1994)). 

P1 clones containing genetically mapped sequences were selected from 20q13.2 and used as FISH probes 
to confirm and further define the region of amplification. Other P1 clones were selected for candidate 
oncogenes broadly localized to the 20q13.2 region (FLpter, 0.81-0.84). These were selected from the 
DuPont P1 library (Shepherd, et al., Proc. Natl. Acad. Sci. USA, 92: 2629 (1994), available commercially from 
Genome Systems), by PCR (Saiki et al., Science, 230: 1350 (1985)) using primer pairs developed in the 3' 
untranslated region of each candidate gene. Gene specific P1 clones were obtained for protein tyrosine 
phosphatase (PTPN1, FLpter 0.78), melanocortin 3 receptor (MC3R, FLpter 0.81), phosphoenolpyruvate 
carboxy kinase (PCK1, FLpter 0.85), zinc finger protein 8 (ZNF8, FLpter 0.93), guanine nucleotide-binding 
protein (GNAS1, FLpter 0.873), src-oncogene (SRC, FLpter 0.669), topoisomerase 1 (TOP1, FLpter 0.675), the 
bcl-2 related gene bcl-x (FLpter 0.526) and the transcription factor E2F-1 (FLpter 0.541). Each clone was 
mapped by digital imaging microscopy and assigned FLpter values. Five of these genes (SRC, TOP1, GNAS1, 
E2F-1 and BCL-x) were excluded as candidate oncogenes in the amplicon because they mapped well outside 
the critical region at FLpter 0.81-0.84. Three genes (PTPNR1, PCK-1 and MC3R) localized close enough to the 
critical region to warrant further investigation. 
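
The triage of candidate genes by FLpter value can be sketched as follows. This is only an illustration: the FLpter values and the critical region (0.81-0.84) are those given above, but the tolerance of 0.03 is a hypothetical cutoff chosen so that the sketch reproduces the grouping described in the text. The text does not state a numerical cutoff, and it does not explicitly assign ZNF8 to either group.

```python
# FLpter values (fractional length from the p-terminus) as listed in the text.
FLPTER = {
    "PTPN1": 0.78,
    "MC3R": 0.81,
    "PCK1": 0.85,
    "ZNF8": 0.93,
    "GNAS1": 0.873,
    "SRC": 0.669,
    "TOP1": 0.675,
    "bcl-x": 0.526,
    "E2F-1": 0.541,
}

CRITICAL_REGION = (0.81, 0.84)  # from the text
TOLERANCE = 0.03                # hypothetical cutoff, NOT from the text


def distance_to_region(flpter, region=CRITICAL_REGION):
    """Return 0.0 inside the region, otherwise the gap to the nearer edge."""
    low, high = region
    if flpter < low:
        return round(low - flpter, 3)  # rounding avoids float noise at the cutoff
    if flpter > high:
        return round(flpter - high, 3)
    return 0.0


def triage(flpter_by_gene, tolerance=TOLERANCE):
    """Split genes into those near the critical region and those excluded."""
    near, excluded = [], []
    for gene, value in flpter_by_gene.items():
        target = near if distance_to_region(value) <= tolerance else excluded
        target.append(gene)
    return sorted(near), sorted(excluded)


near, excluded = triage(FLPTER)
print("near:", near)
print("excluded:", excluded)
```

Under this assumed cutoff, the three genes retained for further study fall in the "near" group and the five excluded genes fall in the "excluded" group, together with ZNF8.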

Interphase FISH on 14 breast cancer cell lines and 36 primary tumors using 24 cosmid and 3 gene specific 
P1 (PTPNR1, PCK-1 and MC3R) probes found high level amplification in 35% (5/14) of breast cancer cell 
lines and 8% (3/36) of primary tumors with one or more probes. The region with the highest copy number in 
4/5 of the cell lines and 3/3 primary tumors was defined by the cosmid RMC20C001. This indicated that 
PTPNR1, PCK1 and MC3R could also be excluded as candidates for oncogenes in the amplicon and, 
moreover, narrowed the critical region from 10 Mb to 1.5-2.0 Mb (see, Tanner et al., Cancer Res., 54: 
4257-4260 (1994)). 

Because probe RMC20C001 detected high-level (3 to 10 fold) 20q13.2 amplification in 35% of cell lines and 8% 
of primary tumors it was used to (1) define the prevalence of amplification in an expanded tumor population, 
(2) assess the frequency and level of amplification in these tumors, (3) evaluate the association of the 
20q13.2 amplicon with pathological and biological features, (4) determine if a relationship exists between 
20q13 amplification and clinical outcome and (5) assess 20q13 amplification in metastatic breast tumors. 

As detailed in Example 1, fluorescent in situ hybridization (FISH) with RMC20C001 was used to assess 
20q13.2 amplification in 132 primary and 11 recurrent breast tumors. The absolute copy number (mean 
number of hybridization signals per cell) and the level of amplification (mean number of signals relative to 
the p-arm reference probe) were determined. Two types of amplification were found: low level amplification 
(1.5-3 fold with FISH signals dispersed throughout the tumor nuclei) and high level amplification (> 3 fold 
with tightly clustered FISH signals). Low level 20q13.2 amplification was found in 29 of the 132 primary 
tumors (22%), whereas nine cases (6.8%) showed high level amplification. 
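
The scoring scheme just described can be illustrated with a short sketch. It assumes that per-nucleus signal counts are available for the test probe and for the p-arm reference probe. The counts below are invented for illustration, the function names are hypothetical, and the handling of values falling exactly on 1.5 or 3 fold is an assumption. The sketch uses only the fold thresholds and ignores the spatial pattern of the signals (dispersed versus clustered), which the text also uses to distinguish the two types.

```python
from statistics import mean

LOW_FOLD = 1.5   # lower bound of low level amplification (from the text)
HIGH_FOLD = 3.0  # above this value the text calls amplification high level


def amplification_level(test_counts, reference_counts):
    """Mean test-probe signals per cell relative to the reference probe."""
    return mean(test_counts) / mean(reference_counts)


def classify(level):
    """Map a fold-amplification value onto the categories used in the text."""
    if level > HIGH_FOLD:
        return "high"
    if level >= LOW_FOLD:
        return "low"
    return "none"


# Hypothetical per-nucleus signal counts, for illustration only.
tumor_a = amplification_level([9, 10, 8, 9], [2, 2, 2, 2])  # 4.5 fold
tumor_b = amplification_level([4, 5, 4, 3], [2, 2, 2, 2])   # 2.0 fold
print(classify(tumor_a), classify(tumor_b))
```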

RMC20C001 and four flanking P1 probes (MC3R, PCK, RMC20C026, and 
RMC20C030) were used to study the extent of DNA amplification in highly amplified tumors. Only 
RMC20C001 was consistently amplified in all tumors. This finding confirmed that the region of common 
amplification is within a 2 Mb interval flanked by but not including PCK-1 and MC3R. 

A physical map was assembled to further localize the minimum common region of amplification and to isolate 
the postulated oncogene(s). The DuPont P1 library (Shepherd et al., Proc. Natl. Acad. Sci. USA, 91: 2629 
(1994)) was screened for STSs likely to map in band 20q13.2. P1 clones at the loci D20S102, D20S100, 
D20S120, D20S183, D20S480, D20S211 were isolated, and FISH localized each to 20q13.2. 

Interphase FISH analysis was then performed in the breast cancer cell line BT474 to assess the amplification 
level at each locus. The loci D20S100-D20S120-D20S183-D20S480-D20S211 were highly amplified in the 
BT474 cell line, whereas D20S102 detected only low level amplification. Therefore, 5 STSs, spanning 5 cM, 
were localized within the 20q13.2 amplicon and were utilized to screen the CEPH megaYAC library. 

CEPH megaYAC library screening and computer searches of public databases revealed D20S120-D20S183- 
D20S480-D20S211 to be linked on each of three megaYAC clones y820f5, 773h10, and 931h6 (Figure 3). 
Moreover, screening the CEPH megaYAC library with STSs generated from the ends of cosmids RMC20C001, 
RMC20C030 and RMC20C026 localized RMC20C001 to each of the same three YAC clones. It was estimated, 
based on the size of the smallest of these YAC clones, that D20S120-D20S183-RMC20C001-D20S480-D20S211 
map into an interval of less than 1.1 Mb. D20S100 was localized 300 kb distal to D20S120 by interphase FISH 
and to YAC90S>C by STS mapping. The combined STS data made it possible to construct a 12 member YAC 
contig which spans roughly 4 Mb encompassing the 1.5 Mb amplicon and containing the loci 
RMC20C030-PCK1-RMC20C001-MC3R-RMC20C026. Each YAC was mapped by FISH to confirm localization to 
20q13.2 and to check for chimerism. Five clonal isolates of each YAC were sized by pulsed field gel 
electrophoresis (PFGE). None of the YACs is chimeric; however, several are highly unstable. 

The YAC contig served as a framework from which to construct a 2 Mb P1 contig spanning the 20q13 
amplicon. P1 clones offered numerous advantages over YAC clones including (1) stability, (2) a chimeric 
frequency of less than 1%, (3) DNA isolation by standard miniprep procedures, (4) they make ideal FISH 
probes, (5) the ends can be sequenced directly, (6) engineered γδ transposons carrying bidirectional primer 
binding sites can be integrated at any position in the cloned DNA (Strathmann et al., Proc. Natl. Acad. Sci. 
USA, 88: 1247 (1991)), and (7) P1 clones are the templates for sequencing the human and Drosophila 
genomes at the LBNL HGC (Palazzolo et al., DOE Human Genome Program, Contractor-Grantee Workshop IV, 
Santa Fe, New Mexico, November 13-17, 1994). 

About 90 P1 clones were isolated by screening the DuPont P1 library either by PCR or filter hybridization. For 
PCR based screening, more than 22 novel STSs were created by two methods. In the first method, the ends 
of P1 clones localized to the amplicon were sequenced, STSs developed, and the P1 library screened for 
walking clones. In the second approach inter-Alu PCR (Nelson et al., 86: 6686-6690 (1989)) was performed 
on YACs spanning the amplicon and the products cloned and sequenced for STS creation. In the filter based 
hybridization scheme P1 clones were obtained by performing inter-Alu PCR on YACs spanning the amplicon, 
radio-labeling the products and hybridizing these against filters containing a gridded array of the P1 library. 
Finally, to close gaps a human genomic bacterial artificial chromosome (BAC) library (Shizuya et al., Proc. 
Natl. Acad. Sci. USA, 89: 8794 (1992), commercially available from Research Genetics, Huntsville, Alabama, 
USA) was screened by PCR. These methods combined to produce more than 100 P1 and BAC clones, which 
were localized to 20q13.2 by FISH. STS content mapping, fingerprinting, and free-chromatin FISH (Heiskanen 
et al., BioTechniques, 17: 928 (1994)) were used to construct the 2 Mb contig shown in Figure 3 as well as in 
Figure 4. Figure 7 is a second physical map of the region showing a minimum tiling contig. 

Fine Mapping the 20q13.2 Amplicon in BT474 

Clones from the 2 Mb P1 contig were used with FISH to map the level of amplification at 20q13.2 in the 
breast cancer cell line BT474. 35 P1 probes distributed at regular intervals along the contig were used. The 
resulting data indicated that the region of highest copy number increase in BT474 occurs between D20S100 
and D20S211, an interval of approximately 1.5 Mb. P1 FISH probes in this interval detect an average of 50 
signals per interphase nucleus in BT474, while no amplification, or only low level amplification, was detected 
with the P1 clones outside this region. Thus, both the proximal and distal boundaries of the amplicon were 
cloned. 

Fine Mapping the 20q13.2 Amplicon in Primary Tumors. 

Fine mapping the amplicon in primary tumors revealed the minimum common region of high amplification 
(MCA) that is of pathobiological significance. This process is analogous to screening for informative meioses 
in the narrowing of genetic intervals encoding heritable disease genes. Analysis of 132 primary tumors 
revealed thirty-eight primary tumors that are amplified at the RMC20C001 locus. Nine of these tumors have 
high level amplification at the RMC20C001 locus and were further analyzed by interphase FISH with 8 P1s 
that span the approximately 2 Mb contig. The minimum common region of amplification (MCA) was mapped 
to an approximately 600 kb interval flanked by P1 clones #3 and #12, with the highest level of amplification 
detected by P1 clone #38 corresponding to RMC20C001 (Figures 3 and 7). 
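
The idea behind a minimum common region can be illustrated as the intersection of the amplified intervals observed in individual tumors. The following is a minimal sketch under stated assumptions: each tumor's highly amplified segment is represented as a (start, end) pair in kb along the contig, and all coordinates are invented for illustration and are not data from this study. In practice the interval boundaries are set by the positions of the flanking probes rather than by exact coordinates.

```python
def minimum_common_region(intervals):
    """Intersect (start, end) intervals; return None if there is no shared overlap."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start < end else None


# Hypothetical amplified intervals (kb along the contig), one per tumor.
amplified = [(300, 1500), (700, 1900), (450, 1250), (0, 1400)]

mca = minimum_common_region(amplified)
print(mca)
```

Each additional tumor with a different breakpoint can only shrink the shared interval, which is why examining more highly amplified tumors narrows the region.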

The P1 and BAC clones spanning the 600 kb interval of the 20q13 amplicon are listed in Tables 1 and 2, 
which provide a cross-reference to the DuPont P1 library described by Shepherd, et al., Proc. Natl. Acad. Sci. 
USA, 92: 2629 (1994). These P1 and BAC probes are available commercially from Genome Systems and 
Research Genetics, respectively. 

Table 1. Cross reference to probes of the DuPont P1 library (Shepherd, et al., Proc. Natl. Acad. Sci. USA, 
92: 2629 (1994), which is commercially available from Genome Systems, St. Louis, Missouri, USA). PCR 
primers are illustrated for amplification of sequence tag sites for each clone. In addition, PCR conditions (Mg 
concentration and annealing temperature), as well as PCR product size, are provided. Size: PCR product 
size; mM MgCl: Mg concentration; Ann.: Annealing temperature; P1: P1 probe ID number; PC: DuPont 
Library Coordinates; SCA: DuPont Library Single Clone address; SEQ-forward and SEQ-backward: forward 
and backward PCR primers, respectively; SEQ ID NO:-forward and SEQ ID NO:-backward: Sequence Listing 
SEQ ID NO: for forward and backward primers, respectively. 

PRIMER NAME | SIZE (bp) | mM MgCl | Ann. | P1 | PC | SCA | SEQ-forward | SEQ-backward | SEQ ID NO:-forward | SEQ ID NO:-backward
352.32 | 136 | 1.5 | 52 | 20 | 103-e5 | 1228e | TTGGCATTGGTATCAGGTAGCTG | TTGGAGCAGAGAGGGGATTGTGTG | 1 | 2
388.13 F1/B1 | 201 | 1.5 | 52 | 17 | 69g6 | 821 | AATCCCCTCAAACCCTGCTGCTAC | TGGAGCCTGAACTTCTGCAATC | 3 | 4
 |  |  |  | 19 | 98f4 | 1167f | * | * | * | *
D20S183 | 270 | 3 | 48 | 30 | 124g6 |  | CCGGGATACCGACATTG | TGCACATAAAACAGCCAGC | 5 | 6
 |  |  |  | 40 | 24h1 | 276h | * | * | * | *
D20S211 | 135 | 1.5 | 52 | 29 | 119f4 | 1418f | TTGGAATCAATGGAGCAAAA | AGCTTTACCCAATGTGGTCC | 7 | 8
D20S480 | 300 | 3 | 55 | 68 | 100d12 | 1199d2 | GTGGTGAACACCCAATAAATGG | AAGCAAATAAAACCAATAAACTCG | 9 | 10
 |  |  |  | 41 | 86c1 | 1020c | * | * | * | *
 |  |  |  | 42 | 103d9 | 1232d | * | * | * | *
 |  |  |  | 67 | 91b2 | 1081b9 | * | * | * | *
9x-SP6 hmFmmB | 165 | 1.5 | 55 | 7 | 31d11 | 370d | CAAGATCTGACCCCGTCAATC | GACTTCTTCAGGAAAGAGATCAGTG | 11 | 12
 |  |  |  | 9 | 35f9 | 416f | * | * | * | *
11S-17 F2/B4 | 146 | 3 | 58 | 11 | 41b1 | 480b | GCCATGTACCCACCTGAAAAATC | TCAGAACACCCGTGCAGAATTAAG | 13 | 14
12T-T7 F2/B1 | 153 | 3 | 58 | 12 | 42c2 | 493c | CCTAAAACTTGGTGCTTAAATCTA | GTCTCACAAGGCAGATGTGG | 15 | 16
28T F1/B1 | 219 | 1.5 | 52 | 74 | 888f2 |  | TTTGTGTATGTTGAGCCATC | CTTCCAATCTCATTCTATGAGG | 17 | 18
 |  |  |  | 2 | 12c6 | 137c | * | * | * | *
 |  |  |  | 25 | 118c11 | 1413c | * | * | * | *
 |  |  |  | 26 | 118c11 | 1413c | * | * | * | *
28S F1/B3 | 214 | 1.5 | 55 | 28 | 118g11 | 1413g | GCTTGTTTAAGTGTCACTAGGG | CACTCTGGTAAATGACCTTTGTC | 19 | 20
 |  |  |  | 25 | 118c11 | 1413c | * | * | * | *
 |  |  |  | 27 | 118g11 | 1413g | * | * | * | *
 |  |  |  | 10 | 36f10 | 429f | * | * | * | *
69S | 100 | 3 | 55 | 69 | 412b5 |  | CCTACACCATTCCAACTTTGG | GCCAGATGTATGTTTGCTACGGAAC | 21 | 22
 |  |  |  | 5 | 23c1 | 264c | * | * | * | *
HSCOFH032 F/B | 129 | 1.5 | 55 | 3 | 12-e11 | 142e | TCTCAAACCTGTCCACTTCTTG | CTGCTGTGGTGGAGAATGG | 23 | 24
 |  |  |  | 18 | 77a10 | 921a | * | * | * | *
60A1 | 191 | 1.5 | 58 | 36 | 112g8 | 1139g | TGTCCTCCTTCTCCCTCATCCTAC | AATGCCTCCACTCACAGGAATG | 25 | 26
 |  |  |  | 39 | 34a6 | 401a | * | * | * | *
820F5ATT F1/B1 | 175 | 1.5 | 48 | 15 | 53c7 | 630c | CCTCTTCAGTGTCTTCCTATTGA | GGGAGGAGGTTGTAGGCAAC | 27 | 28
 |  |  |  | 16 | 58b9 | 692b | * | * | * | *
10311f33T7F3/B3 | 1B5 | 1141D7

Table 2. Cross reference to probes of the BAC library. Clone # refers to the clone number provided, e.g., in 
Figures 3, 4 and 7, while the plate coordinates are the plate coordinates in the Research Genetics BAC 
library. Size: PCR product size; mM MgCl: Mg concentration; Ann.: Annealing temperature; BAC #: BAC 
probe ID number; SEQ-forward and SEQ-backward: forward and backward PCR primers, respectively; SEQ 
ID NO:-forward and SEQ ID NO:-backward: Sequence Listing SEQ ID NO: for forward and backward primers, 
respectively. 

PRIMER NAME | SIZE (bp) | mM MgCl | Ann. | BAC # | BAC Plate Coordinates | SEQ-forward | SEQ-backward | SEQ ID NO:-forward | SEQ ID NO:-backward
18T F1/B1 | 156 | 3 | 62 | 99 | L11 plate 146 | AGCAAAGCAAAGGTGGCACAC | TGACATGGGAGAAGACACACTTCC | 29 | 30
9S F1/B1 | 214 | 1.5 | 55 | 97 | E8 plate 183 | AGGTTTACCAATGTGTTTGG | TCTACATCCCATTCTCTTCTG | 31 | 32
D20S480 | 300 | 3 | 55 | 95 | H15 plate 140 | GTGGTGAACACCAATAAATGG | AAGCAAATAAAACCAATAAACTCG | 9 | 10
D20S211 | 135 | 1.5 | 52 | 103 | A15 plate 188 | TTGGAATCAATGGAGCAAAA | AGCTTTACCCAATGTGGTCC | 7 | 8
 |  |  |  | 102 | A1 plate 46 | - | - | -
11S-17 F2/B4 | 146 | 3 | 58 | 100 | E4 plate 43 | GCCATGTACCCACCTGAAAAATC | TCAGAACACCCGTGCAGAATTAAG | 13 | 14
 |  |  |  | 101 | J5 plate 118 | - | - | -
CYP24 |  |  |  | 87 | J14 plate 96
ET4211 |  |  |  | 104 | C10 plate 754
ET03.17 |  |  |  | 121 | C10 plate 10
D20S902 |  |  |  | 166 | 14 plate 226
180-R F1/B1 |  |  |  | 188 | G21 plate 163
164-1Fflft>2 |  |  |  | 197 | N13 plate 309

cDNA sequences from the 20q13 amplicon. 

Exon trapping (see, e.g., Duyk et al., Proc. Natl. Acad. Sci. USA 87: 8995-8999 (1990) and Church et al., 
Nature Genetics, 6: 98-105 (1994)) was performed on the P1 and BAC clones spanning the 600 kb minimum 
common region of amplification and has isolated more than 200 exons. 

Analysis of the exon DNA sequences revealed a number of sequence similarities (85% to 96%) to partial 
cDNA sequences in the expressed sequence data base (dbEST) and to a S. cerevisiae chromosome XIV 
open reading frame. Each P1 clone subjected to exon trapping has produced multiple exons, consistent with 
at least a medium density of genes. Over 200 exons have been trapped and analyzed, as well as 200 clones 
isolated by direct selection from a BT474 cDNA library. In addition, a 0.6 Mb genomic interval spanning the 
minimal amplicon described below is being sequenced. Exon prediction and gene modeling are carried out 
with the XGRAIL, SORFIND, and BLAST programs. Gene fragments identified by these approaches have been 
analyzed by RT-PCR, Northern and Southern blots. Fifteen unique genes were identified in this way (see, 
Table 3 and Figure 5). 

In addition, two other genes, ZABC1 (SEQ. ID. Nos. 9 and 10) and 1b1 (SEQ ID No. 12), were also shown 
to be overexpressed in a variety of different cancer cells. 

Sequence information from various cDNA clones is provided below. They are as follows: 

3bf4 (SEQ. ID. No. 1) - 3 kb transcript with sequence identity to a tyrosine kinase gene, termed A6, disclosed 
in Beeler et al., Mol. Cell. Biol. 14: 982-988 (1994) and WO 95/19439. These references, however, do not 
disclose that the gene is located in the chromosome 20 amplicon. 

1b11 (SEQ. ID. No. 2) - an approximately 3.5 kb transcript whose expression shows high correlation with the 
copy number of the amplicon. The sequence shows no homology with sequences in the databases 
searched. 

cc49 (SEQ. ID. No. 3) - a 6-7 kb transcript which shows homology to C2H2 zinc finger genes. 

cc43 (SEQ. ID. No. 4) - an approximately 4 kb transcript which is expressed in normal tissues, but whose 
expression in the breast cancer cell line has not been detected. 

41.1 (SEQ. ID. No. 5) - shows homology to the homeobox T shirt gene in Drosophila. 

GCAP (SEQ. ID. No. 6) - encodes a guanylate cyclase activating protein which is involved in the biosynthesis 
of cyclic AMP. As explained in detail below, sequences from this gene can also be used for treatment of 
retinal degeneration. 

1b4 (SEQ. ID. No. 7) - a serine threonine kinase. 

20sa7 (SEQ. ID No. 8) - a homolog of the rat gene, BEM-1. 

In addition, the entire nucleotide sequence is provided for ZABC-1. 

ZABC-1 stands for zinc finger amplified in breast cancer. This gene maps to the core of the 20q13.2 amplicon 
and is overexpressed in primary tumors and breast cancer cell lines having 20q13.2 amplification. The 
genomic sequence (SEQ. ID. No. 9) includes roughly 2 kb of the promoter region. SEQ ID. No. 10 provides 
the cDNA sequence derived open reading frame and SEQ ID. No. 11 provides the predicted protein 
sequence. Zinc finger containing genes are often transcription factors that function to modulate the 
expression of downstream genes. Several known oncogenes are in fact zinc finger containing genes. 

This invention also provides the full length cDNA sequence for a cDNA designated 1b1 (SEQ. ID. No. 12) 
which is also overexpressed in numerous breast cancer cell lines and some primary tumors. 

SEQ ID NO: 13 provides sequence from a genomic clone which is similar to known rat and mouse cyclophilin 
cDNAs. Rat cyclophilin nucleic acids (e.g., cDNAs) are known; see GenBank(TM) under accession No. 
M19533. Mouse cyclophilin nucleic acids (e.g., cDNAs) are known; see GenBank(TM) under accession 
No. 50620 (see, Figure 6). Accordingly, SEQ ID NO: 13 is a putative human cyclophilin gene. The sequence 
is also associated with amplified sequences from 20q13, and can be used as a probe or probe hybridization 
target to detect DNA amplification, or RNA overexpression of the corresponding gene. 

Table 3. Gene fragments identified by exon trapping and analyzed by RT-PCR, Northern and Southern blots. 

Map # | Gene ID | Transcript Size (kb) | EST Identity (>95%) | Cloned Sequence (kb) | Protein Homologies | Map Location
1 | 20sa7 |  | Yes | 3 | PTP | 3, 18, 99
2 | 1b11 | 3.5 | No | 1.5 | novel | 18, 3, 99, 69
3 | 200.1 |  |  |  |  | 
4 | 132.1 |  |  |  | novel | 132
5 | 132.2 |  |  |  |  | 132
6 | 3bf4 | 3 | Yes | 3 | PTK | ambiguous
7 | 7.1 |  |  |  |  | 7, 11, 97
8 | 7.2 | 2.4 |  |  |  | 7, 11, 97
9 | cc49 | 7, 4 | Yes | 3.6 | Kruppel | 97, 103
10 | cc43 | 1.4 | Yes | 1.8 | hypothetical | 97, 7, 11
11 | et1807 | 2.5 | Yes | 0.7, 0.7 | novel | 97, 9
12 | et2106 | None detected | Yes |  |  | 95, 39, 38
13 | 41.1 |  | Yes | 3 | homeotic gene | 95, 41, 42
14 | 67.1 | 7, 8, 11 | Yes | 2 |  | 67
15 | 67.2 |  | Yes |  | cGMP reg. protein | 67

20q13 Amplicon Proteins 

As indicated above, this invention also provides for proteins encoded by nucleic acid sequences in the 20q13 
amplicon (e.g., SEQ. ID. Nos: 1-10 and 12-13) and subsequences, more preferably subsequences of at least 
10 amino acids, preferably of at least 20 amino acids, and most preferably of at least 30 amino acids in 
length. Particularly preferred subsequences are epitopes specific to the 20q13 proteins, more preferably 
epitopes specific to the ZABC1 and 1b1 proteins. Such proteins include, but are not limited to, isolated 
polypeptides comprising at least 10 contiguous amino acids from a polypeptide encoded by the nucleic acids 
of SEQ. ID No. 1-10 and 12-13 or from the polypeptide of SEQ. ID. No. 11, wherein the polypeptide, when 
presented as an immunogen, elicits the production of an antibody which specifically binds to a polypeptide 
selected from the group consisting of a polypeptide encoded by the nucleic acids of SEQ. ID No. 1-10 and 
12-13 or from the polypeptide of SEQ. ID. No. 11, and the polypeptide does not bind to antisera raised 
against a polypeptide selected from the group consisting of a polypeptide encoded by the nucleic acids of 
SEQ. ID No. 1-10 and 12 or from the polypeptide of SEQ. ID. No. 11 which has been fully immunosorbed 
with a polypeptide selected from the group consisting of a polypeptide encoded by the nucleic acids of SEQ. 
ID No. 1-10 and 12-13 or from the polypeptide of SEQ. ID. No. 11. 

A protein that specifically binds to or that is specifically immunoreactive with an antibody generated against a 
defined immunogen, such as an immunogen consisting of the amino acid sequence of SEQ ID NO: 11, is 
determined in an immunoassay. The immunoassay uses a polyclonal antiserum which was raised to the 
protein of SEQ ID NO: 11 (the immunogenic polypeptide). This antiserum is selected to have low cross 
reactivity against other similar known polypeptides, and any such cross reactivity is removed by 
immunoabsorption prior to use in the immunoassay (e.g., by immunosorption of the antisera with the related 
polypeptide). 

In order to produce antisera for use in an immunoassay, the polypeptide, e.g., the polypeptide of SEQ ID NO: 
11, is isolated as described herein. For example, recombinant protein can be produced in a mammalian or 
other eukaryotic cell line. 

An inbred strain of mice is immunized with the protein of SEQ ID NO: 11 using a standard adjuvant, such as 
Freund's adjuvant, and a standard mouse immunization protocol (see Harlow and Lane, supra). Alternatively, 
a synthetic polypeptide derived from the sequences disclosed herein and conjugated to a carrier protein is 
used as an immunogen. Polyclonal sera are collected and titered against the immunogenic polypeptide in an 
immunoassay, for example, a solid phase immunoassay with the immunogen immobilized on a solid support. 
Polyclonal antisera with a titer of 10^4 or greater are selected and tested for their cross reactivity against 
known polypeptides using a competitive binding immunoassay such as the one described in Harlow and 
Lane, supra, at pages 570-573. Preferably more than one known polypeptide is used in this determination in 
conjunction with the immunogenic polypeptide. 

The known polypeptides can be produced as recombinant proteins and isolated using standard molecular 
biology and protein chemistry techniques as described herein. 

Immunoassays in the competitive binding format are used for cross reactivity determinations. For example, 
the immunogenic polypeptide is immobilized to a solid support. Proteins added to the assay compete with 
the binding of the antisera to the immobilized antigen. The ability of the added proteins to compete with the 
binding of the antisera to the immobilized protein is compared to that of the immunogenic polypeptide. The 
percent cross reactivity for the protein is calculated, using standard calculations. Those antisera with less 
than 10% cross reactivity to known polypeptides are selected and pooled. The cross-reacting antibodies are 
then removed from the pooled antisera by immunoabsorption with known polypeptide. 
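
The selection step can be sketched numerically. The text refers only to "standard calculations"; the sketch below assumes one commonly used definition, in which percent cross reactivity is the ratio of the amount of immunogen to the amount of competitor needed for 50% inhibition, multiplied by 100. The antiserum names, competitor names, and all values are hypothetical.

```python
def percent_cross_reactivity(ic50_immunogen, ic50_competitor):
    """Assumed definition: 100 * IC50(immunogen) / IC50(competitor)."""
    return 100.0 * ic50_immunogen / ic50_competitor


def select_antisera(assays, threshold=10.0):
    """Keep antisera whose cross reactivity to every known polypeptide is below threshold.

    `assays` maps an antiserum name to a dict holding the IC50 of the immunogen
    and a dict of IC50 values for each known (potentially cross-reacting) polypeptide.
    """
    selected = []
    for name, data in assays.items():
        values = [
            percent_cross_reactivity(data["immunogen"], ic50)
            for ic50 in data["known"].values()
        ]
        if all(v < threshold for v in values):
            selected.append(name)
    return selected


# Hypothetical IC50 values (arbitrary concentration units), for illustration only.
assays = {
    "serum_A": {"immunogen": 2.0, "known": {"protein_X": 50.0, "protein_Y": 100.0}},
    "serum_B": {"immunogen": 2.0, "known": {"protein_X": 10.0, "protein_Y": 100.0}},
}

pooled = select_antisera(assays)
print(pooled)
```

Here serum_A shows 4% and 2% cross reactivity and would be pooled, while serum_B reaches 20% against protein_X and would be rejected.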

The immunoabsorbed and pooled antisera are then used in a competitive binding immunoassay as 
described herein to compare a target polypeptide to the immunogenic polypeptide. To make this 
comparison, the two polypeptides are each assayed at a wide range of concentrations and the amount of 
each polypeptide required to inhibit 50% of the binding of the antisera to the immobilized protein is 
determined using standard techniques. If the amount of the target polypeptide required is less than twice the 
amount of the immunogenic polypeptide that is required, then the target polypeptide is said to specifically 
bind to an antibody generated to the immunogenic protein. As a final determination of specificity, the pooled 
antisera are fully immunosorbed with the immunogenic polypeptide until no binding to the polypeptide used in 
the immunosorption is detectable. The fully immunosorbed antisera are then tested for reactivity with the test 
polypeptide. If no reactivity is observed, then the test polypeptide is specifically bound by the antisera elicited 
by the immunogenic protein. 
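
The two-fold criterion described above can be sketched as follows. The sketch assumes that each competition curve is recorded as percent binding remaining at a series of ascending competitor concentrations, and it estimates the 50% inhibition point by log-linear interpolation between the two bracketing measurements. That interpolation is one simple choice among the "standard techniques" the text mentions, not a method specified here. All curve values and function names are hypothetical.

```python
import math


def ic50_from_curve(concentrations, percent_bound):
    """Interpolate (on a log scale) the concentration giving 50% binding.

    Expects ascending concentrations and binding that decreases with concentration.
    """
    points = list(zip(concentrations, percent_bound))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 50.0 >= b_hi and b_lo != b_hi:
            fraction = (b_lo - 50.0) / (b_lo - b_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("curve does not cross 50% binding")


def specifically_binds(ic50_target, ic50_immunogen):
    """Criterion from the text: target needs less than twice the immunogen amount."""
    return ic50_target < 2.0 * ic50_immunogen


# Hypothetical competition curves (concentration, % binding remaining).
conc = [1.0, 10.0, 100.0]
immunogen_ic50 = ic50_from_curve(conc, [90.0, 50.0, 10.0])   # 10
close_ic50 = ic50_from_curve(conc, [92.0, 55.0, 15.0])       # about 13.3
distant_ic50 = ic50_from_curve(conc, [95.0, 70.0, 30.0])     # about 31.6

print(specifically_binds(close_ic50, immunogen_ic50))
print(specifically_binds(distant_ic50, immunogen_ic50))
```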

Similarly, in a reciprocal experiment, the pooled antisera are immunosorbed with the test polypeptide. If the 
antisera which remain after the immunosorption do not bind to the immunogenic polypeptide (i.e., the 
polypeptide of SEQ ID NO: 11 used to elicit the antisera) then the test polypeptide is specifically bound by 
the antisera elicited by the immunogenic peptide. 

Detection of 20q13 Abnormalities. 

One of skill in the art will appreciate that the clones and sequence information provided herein can be used 
to detect amplifications, or other chromosomal abnormalities, at 20q13 in a biological sample. Generally the 
methods involve hybridization of probes that specifically bind one or more nucleic acid sequences of the 
target amplicon with nucleic acids present in a biological sample or derived from a biological sample. 

As used herein, a biological sample is a sample of biological tissue or fluid containing cells desired to be 
screened for chromosomal abnormalities (e.g., amplifications or deletions). In a preferred embodiment, the 
biological sample is a cell or tissue suspected of being cancerous (transformed). Methods of isolating 
biological samples are well known to those of skill in the art and include, but are not limited to, aspirations, 
tissue sections, needle biopsies, and the like. Frequently the sample will be a "clinical sample" which is a 
sample derived from a patient. It will be recognized that the term "sample" also includes supernatant 
(containing cells) or the cells themselves from cell cultures, cells from tissue culture and other media in 
which it may be desirable to detect chromosomal abnormalities. 

In some embodiments, a chromosome sample is prepared by depositing cells, either as single cell 
suspensions or as tissue preparation, on solid supports such as glass slides and fixed by choosing a fixative 
which provides the best spatial resolution of the cells and the optimal hybridization efficiency. In other 
embodiments, the sample is contacted with an array of probes immobilized on a solid surface. 

Making Probes 

Any of the P1 probes listed in Table 1, the BAC probes listed in Table 2, or the cDNAs disclosed here are 
suitable for use in detecting the 20q13 amplicon. 

Methods of preparing probes are well known to those of skill in the art (see, e.g. Sambrook et al., 
Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989) or 
Current Protocols in Molecular Biology, F. Ausubel et al., ed. Greene Publishing and Wiley-Interscience, 
New York (1987)). 

Given the strategy for making the nucleic acids of the present invention, one of skill can construct a variety of 
vectors and nucleic acid clones containing functionally equivalent nucleic acids. Cloning methodologies to 
accomplish these ends, and sequencing methods to verify the sequence of nucleic acids are well known in 
the art. Examples of appropriate cloning and sequencing techniques, and instructions sufficient to direct 
persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning 
Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, CA (Berger); 
Sambrook et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed.) Vol. 1-3, 
Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, (Sambrook); and 
Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between 
Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1994 Supplement) (Ausubel). Product 
information from manufacturers of biological reagents and experimental equipment also provide information 
useful in known biological methods. Such manufacturers include the SIGMA chemical company (Saint Louis, 
MO), R&D systems (Minneapolis, MN), Pharmacia LKB Biotechnology (Piscataway, NJ), CLONTECH 
Laboratories, Inc. (Palo Alto, CA), Chem Genes Corp., Aldrich Chemical Company (Milwaukee, WI), Glen 
Research, Inc., GIBCO BRL Life Technologies, Inc. (Gaithersburg, MD), Fluka Chemica-Biochemika 
Analytika (Fluka Chemie AG, Buchs, Switzerland), Invitrogen, San Diego, CA, and Applied Biosystems 
(Foster City, CA), as well as many other commercial sources known to one of skill. 

The nucleic acids provided by this invention, whether RNA, cDNA, genomic DNA, or a hybrid of the various 
combinations, are isolated from biological sources or synthesized in vitro. The nucleic acids and vectors of 
the invention are present in transformed or transfected whole cells, in transformed or transfected cell lysates, 
or in a partially purified or substantially pure form. 

In vitro amplification techniques suitable for amplifying sequences to provide a nucleic acid, or for subsequent 
analysis, sequencing or subcloning are known. Examples of techniques sufficient to direct persons of skill 
through such in vitro amplification methods, including the polymerase chain reaction (PCR), the ligase chain 
reaction (LCR), Qβ-replicase amplification and other RNA polymerase mediated techniques (e.g., NASBA) 
are found in Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) U.S. Patent No. 4,683,202; PCR 
Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, CA (1990) 
(Innis); Arnheim & Levinson (October 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3, 81-94; 
Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86, 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87, 
1874; Lomell et al. (1989) J. Clin. Chem 35, 1826; Landegren et al., (1988) Science 241, 1077-1080; 
Van Brunt (1990) Biotechnology 8, 291-294; Wu and Wallace, (1989) Gene 4, 560; 
Barringer et al. (1990) Gene 89, 117, and Sooknanan and Malek (1995) Biotechnology 13: 563-564. 
Improved methods of cloning in vitro amplified nucleic acids are described in 
Wallace et al., U.S. Pat. No. 5,426,039. Improved methods of amplifying large nucleic acids are summarized 
in Cheng et al. (1994) Nature 369: 684-685 and the references therein. 

Nucleic acids (e.g., oligonucleotides) for in vitro amplification methods or for use as gene probes, for 
example, are typically chemically synthesized according to the solid phase phosphoramidite triester method 
described by Beaucage and Caruthers (1981), Tetrahedron Letts., 22(20): 1859-1862, e.g., using an 
automated synthesizer, as described in Needham-VanDevanter et al. (1984) Nucleic 
Acids Res., 12:6159-6168. Purification of oligonucleotides, where necessary, is typically performed by either 
native acrylamide gel electrophoresis or by anion-exchange HPLC as described in Pearson and Regnier 
(1983) J. Chrom. 255:137-149. The sequence of the synthetic oligonucleotides can be verified using the 
chemical degradation method of Maxam and Gilbert (1980) in Grossman and Moldave (eds.) Academic 
Press, New York, Methods in Enzymology 65:499-560. 

The probes are most easily prepared by combining and labeling one or more of the constructs listed in 
Tables 1 and 2. Prior to use, the constructs are fragmented to provide smaller nucleic acid fragments that 
easily penetrate the cell and hybridize to the target nucleic acid. Fragmentation can be by any of a number of 
methods well known to those of skill in the art. Preferred methods include treatment with a restriction enzyme 
to selectively cleave the molecules, or alternatively to briefly heat the nucleic acids in the presence of Mg2+. 
Probes are preferably fragmented to an average fragment length ranging from about 50 bp to about 2000 bp, 
more preferably from about 100 bp to about 1000 bp and most preferably from about 150 bp to about 500 bp. 

Alternatively, probes can be produced by amplifying (e.g. via PCR) selected subsequences from the 20q13 
amplicon disclosed herein. The sequences provided herein permit one of skill to select primers that amplify 
sequences from one or more exons located within the 20q13 amplicon. 

Particularly preferred probes include nucleic acids from probes 36, 40, and 79, which corresponds to 
RMC20C001. In addition, the cDNAs are particularly useful for identifying cells that have increased 
expression of the corresponding genes, using for instance, Northern blot analysis. 

One of skill in the art will appreciate that, using the sequence information and clones provided herein, the 
same or similar probes can be isolated from other human genomic libraries using routine methods (e.g. 
Southern or Northern Blots). 

Similarly, the polypeptides of the invention can be synthetically prepared in a wide variety of well-known ways. 
For instance, polypeptides of relatively short length can be synthesized in solution or on a solid support in 
accordance with conventional techniques. See, e.g., Merrifield (1963) J. Am. Chem. Soc. 85:2149-2154. 
Various automatic synthesizers are commercially available and can be used in 
accordance with known protocols. See, e.g., Stewart and Young (1984) Solid Phase Peptide Synthesis, 2d. 
ed., Pierce Chemical Co. As described in more detail herein, the polypeptides of the invention are most 
preferably made using recombinant techniques, e.g., by expressing the polypeptides in host cells and 
purifying the expressed proteins. 

In a preferred embodiment, the polypeptides, or subsequences thereof are synthesized using recombinant 
DNA methodology. Generally this involves creating a DNA sequence that encodes the protein, through 
recombinant, synthetic or in vitro amplification techniques, placing the DNA in an expression cassette under 
the control of a particular promoter, expressing the protein in a host cell, isolating the expressed protein and, 
if required, renaturing the protein. 

Labelling Nucleic Acids 

Methods of labeling nucleic acids (either probes or sample nucleic acids) are well known to those of skill in 
the art. Preferred labels are those that are suitable for use in in situ hybridization. The nucleic acid 
probes or samples of the invention may be detectably labeled prior to the hybridization reaction. 

Alternatively, a detectable label which binds to the hybridization product may be used. Such detectable 
labels include any material having a detectable physical or chemical property and have been well-developed 
in the field of immunoassays. 

As used herein, a "label" is any composition detectable by spectroscopic, photochemical, biochemical, 
immunochemical, or chemical means. 

Useful labels in the present invention include radioactive labels (e.g. 32P, 125I, 14C, 3H, and 35S), 
fluorescent dyes (e.g. fluorescein, rhodamine, Texas Red, etc.), electron-dense reagents (e.g. gold), 
enzymes (as commonly used in an ELISA), colorimetric labels (e.g. colloidal gold), magnetic labels 
(e.g. Dynabeads™), and the like. Examples of labels which are not directly detected but are detected 
through the use of directly detectable label include biotin and digoxigenin as well as haptens and proteins for 
which labeled antisera or monoclonal antibodies are available. 

The particular label used is not critical to the present invention, so long as it does not interfere with the in situ 
hybridization of the stain. However, stains directly labeled with fluorescent labels (e.g. fluorescein-12-dUTP, 
Texas Red-5-dUTP, etc.) are preferred for chromosome hybridization. 

A direct labeled probe, as used herein, is a probe to which a detectable label is attached. Because the direct 
label is already attached to the probe, no subsequent steps are required to associate the probe with the 
detectable label. In contrast, an indirect labeled probe is one which bears a moiety to which a detectable 
label is subsequently bound, typically after the probe is hybridized with the target nucleic acid. 

In addition the label must be detectable in as low copy number as possible thereby maximizing the sensitivity 
of the assay and yet be detectable above any background signal. Finally, a label must be chosen that 
provides a highly localized signal thereby providing a high degree of spatial resolution when physically 
mapping the stain against the chromosome. Particularly preferred fluorescent labels include 
fluorescein-12-dUTP and Texas Red-5-dUTP. 

The labels may be coupled to the probes in a variety of means known to those of skill in the art. In a 
preferred embodiment the nucleic acid probes will be labeled using nick translation or random primer 
extension (Rigby, et al. J. Mol. Biol., 113: 237 (1977) or Sambrook, et al., Molecular Cloning - A Laboratory 
Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1985)). 

One of skill in the art will appreciate that the probes of this invention need not be absolutely specific for the 
targeted 20q13 region of the genome. Rather, the probes are intended to produce "staining contrast". 
"Contrast" is quantified by the ratio of the probe intensity of the target region of the genome to that of the 
other portions of the genome. For example, a DNA library produced by cloning a particular chromosome (e.g. 
chromosome 7) can be used as a stain capable of staining the entire chromosome. The library contains both 
sequences found only on that chromosome, and sequences shared with other chromosomes. Roughly half 
the chromosomal DNA falls into each class. If hybridization of the whole library were capable of saturating all 
of the binding sites on the target chromosome, the target chromosome would be twice as bright (contrast 
ratio of 2) as the other chromosomes since it would contain signal from both the specific and the shared 
sequences in the stain, whereas the other chromosomes would only be stained by the shared sequences. 
Thus, only a modest decrease in hybridization of the shared sequences in the stain would substantially 
enhance the contrast. Thus contaminating sequences which only hybridize to non-targeted sequences, for 
example, impurities in a library, can be tolerated in the stain to the extent that the sequences do not reduce 
the staining contrast below useful levels. 
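
The contrast arithmetic in the preceding paragraph can be worked through numerically. The sketch below is illustrative only; the function name and the `shared_efficiency` parameter (the fraction of shared-sequence hybridization that remains, e.g. after blocking) are assumptions introduced here, not terms used in the disclosure.

```python
def contrast_ratio(specific_fraction, shared_fraction, shared_efficiency=1.0):
    """Ratio of signal on the target chromosome to signal on other chromosomes.

    specific_fraction: fraction of library sequences found only on the target
    shared_fraction:   fraction of library sequences shared with other chromosomes
    shared_efficiency: fraction of shared-sequence hybridization that remains
                       (1.0 = unblocked); a hypothetical parameter
    """
    shared_signal = shared_fraction * shared_efficiency
    if shared_signal <= 0:
        raise ValueError("shared signal must be positive to define a ratio")
    # The target carries specific + shared signal; other chromosomes carry
    # only the shared signal.
    return (specific_fraction + shared_signal) / shared_signal

# Half specific, half shared, no blocking: the contrast ratio of 2 from the text
print(contrast_ratio(0.5, 0.5))
# Halving hybridization of the shared sequences raises the contrast
print(contrast_ratio(0.5, 0.5, shared_efficiency=0.5))
```

Under these assumptions, halving the shared-sequence hybridization raises the ratio from 2 to 3, which is consistent with the statement that a modest decrease substantially enhances contrast.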

Detecting the 20q13 Amplicon. 

As explained above, detection of amplification in the 20q13 amplicon is 
indicative of the presence and/or prognosis of a large number of cancers. These 
include, but are not limited to breast, ovary, bladder, head and neck, and colon. 

In a preferred embodiment, a 20q13 amplification is detected through 
the hybridization of a probe of this invention to a target nucleic acid (e.g. a 
chromosomal sample) in which it is desired to screen for the amplification. Suitable hybridization formats are 
well known to those of skill in the art and include, but are not limited to, variations of Southern Blots, in situ 
hybridization and quantitative amplification methods such as quantitative PCR (see, e.g. Sambrook, supra, 
Kallioniemi et al., Proc. Natl Acad Sci USA, 89: 5321-5325 (1992), and PCR 
Protocols, A Guide to Methods and Applications, Innis et al., Academic Press, Inc. 
N.Y., (1990)). 

In situ Hybridization. 

In a preferred embodiment, the 20q13 amplicon is identified using in situ hybridization. Generally, in situ 
hybridization comprises the following major steps: (1) fixation of tissue or biological structure to be analyzed; 
(2) prehybridization treatment of the biological structure to increase accessibility of target DNA, and to reduce 
nonspecific binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the biological 
structure or tissue; (4) posthybridization washes to remove nucleic acid fragments not bound in the 
hybridization and (5) detection of the hybridized nucleic acid fragments. The reagents used in each of these 
steps and their conditions for use vary depending on the particular application. 

In some applications it is necessary to block the hybridization capacity of repetitive sequences. In this case, 
human genomic DNA is used as an agent to block such hybridization. The preferred size range is from about 
200 bp to about 1000 bases, more preferably between about 400 to about 800 bp for double stranded, 
nick translated nucleic acids. 

Hybridization protocols for the particular applications disclosed here are described in Pinkel et al. Proc. Natl. 
Acad. Sci. USA, 85: 9138-9142 (1988) and in EPO Pub. No. 430,402. Suitable hybridization protocols can 
also be found in Methods in Molecular Biology Vol. 33: In Situ Hybridization Protocols, K.H.A. 
Choo, ed., Humana Press, Totowa, New Jersey, (1994). In a particularly preferred embodiment, the 
hybridization protocol of Kallioniemi et al., Proc. Natl Acad Sci 
USA, 89: 5321-5325 (1992) is used. 

Typically, it is desirable to use dual color FISH, in which two probes are utilized, each labeled by a different 
fluorescent dye. A test probe that hybridizes to the region of interest is labeled with one dye, and a control 
probe that hybridizes to a different region is labeled with a second dye. A nucleic acid that hybridizes to a 
stable portion of the chromosome of interest, such as the centromere region, is often most useful as the 
control probe. In this way, differences between efficiency of hybridization from sample to sample can be 
accounted for. 
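
One way the control probe can be used to normalize the test signal is sketched below. This is a minimal illustration under stated assumptions: the per-nucleus signal counts are invented, and scoring by the mean test-to-control ratio is a common convention assumed here rather than a procedure specified in this disclosure.

```python
from statistics import mean

def fish_copy_ratio(nuclei):
    """Mean ratio of test-probe signals to control-probe signals per nucleus.

    nuclei: list of (test_signals, control_signals) counts, one pair per
            scored nucleus. Nuclei with no control signal are skipped because
            the hybridization cannot be normalized for them.
    """
    ratios = [test / control for test, control in nuclei if control > 0]
    if not ratios:
        raise ValueError("no nuclei with a detectable control signal")
    return mean(ratios)

# Hypothetical counts: (20q13 test probe signals, centromere control signals)
scored = [(4, 2), (6, 2), (2, 2), (5, 0)]
print(fish_copy_ratio(scored))
```

A ratio near 1 would suggest no relative gain of the test region, while larger values would suggest amplification relative to the control locus.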

The FISH methods for detecting chromosomal abnormalities can be performed on nanogram quantities of 
the subject nucleic acids. Paraffin embedded tumor sections can be used, as can fresh or frozen material. 
Because FISH can be applied to limited material, touch preparations prepared from uncultured primary 
tumors can also be used (see, e.g., Kallioniemi, A. et al., Cytogenet. Cell Genet. 60: 190-193 (1992)). For 
instance, small biopsy tissue samples from tumors can be used for touch preparations (see, e.g., Kallioniemi, 
A. et al., Cytogenet. Cell Genet. 60: 190-193 (1992)). Small numbers of cells obtained from aspiration biopsy 
or cells in bodily fluids (e.g., blood, urine, sputum and the like) can also be analyzed. For prenatal diagnosis, 
appropriate samples will include amniotic fluid and the like. 

Other Formats 

A number of hybridization formats are useful in the invention. For instance, Southern hybridizations can be 
used. In a Southern Blot, a genomic or cDNA (typically fragmented and separated on an electrophoretic gel) 
is hybridized to a probe specific for the target region. Comparison of the intensity of the hybridization signal 
from the probe for the target region (e.g., 20q13) with the signal from a probe directed to a control 
(non-amplified) region, such as centromeric DNA, provides an estimate of the relative copy number of the 
target nucleic acid. 
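
The comparison of target and control signals can be expressed as a simple ratio. The sketch below is an assumption-laden illustration: the band intensities are invented, and dividing by the same ratio measured in a normal reference sample is an added normalization step assumed here so that an unamplified sample scores 1.0.

```python
def relative_copy_number(sample, reference):
    """Estimate relative copy number from Southern blot band intensities.

    sample, reference: dicts with 'target' (e.g. 20q13 probe) and 'control'
    (e.g. centromeric probe) intensities in arbitrary densitometry units.
    The reference is a hypothetical normal sample run on the same blot.
    """
    sample_ratio = sample["target"] / sample["control"]
    reference_ratio = reference["target"] / reference["control"]
    return sample_ratio / reference_ratio

# Hypothetical densitometry readings
tumor = {"target": 1200.0, "control": 300.0}
normal = {"target": 500.0, "control": 250.0}
print(relative_copy_number(tumor, normal))
```

With these invented readings the tumor sample shows twice the target-to-control ratio of the normal sample.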

Other formats use arrays of probes or targets to which nucleic acid samples are hybridized as described, for 
instance, in WO 96/17958. As used herein, a "nucleic acid array" is a plurality of target elements, each 
comprising one or more target nucleic acid molecules immobilized on a solid surface to which probe nucleic 
acids are hybridized. Target nucleic acids of a target element typically have their origin in the 20q13 
amplicon. The target nucleic acids of a target element may, for example, contain sequence from specific 
genes or clones disclosed here. Target elements of various dimensions can be used in the arrays of the 
invention. Generally, smaller target elements are preferred. Typically, a target element will be less than 
about 1 cm in diameter. Generally element sizes are from 1 µm to about 3 mm, preferably between about 
5 µm and about 1 mm. 

The target elements of the arrays may be arranged on the solid surface at different densities. The target 
element densities will depend upon a number of factors, such as the nature of the label, the solid support, 
and the like. One of skill will recognize that each target element may comprise a mixture of target nucleic 
acids of different lengths and sequences. Thus, for example, a target element may contain more than one 
copy of a cloned piece of DNA, and each copy may be broken into fragments of different lengths. The length 
and complexity of the target sequences of the invention is not critical to the invention. One of skill can adjust 
these factors to provide optimum hybridization and signal production for a given hybridization procedure, and 
to provide the required resolution among different genes or genomic locations. Typically, the target 
sequences will have a complexity between about 1 kb and about 1 Mb, sometimes between about 10 kb and 
about 500 kb, and usually from about 50 kb to about 150 kb. 

Detecting Mutations in Genes from the 20q13 Amplicon 

The cDNA sequences disclosed here can also be used for detecting mutations (e.g., substitutions, insertions, 
and deletions) within the corresponding endogenous genes. One of skill will recognize that the nucleic acid 
hybridization techniques generally described above can be adapted to detect such mutations. 

For instance, oligonucleotide probes that distinguish between mutant and wild-type forms of the target gene 
can be used in standard hybridization assays. In some embodiments, amplification (e.g., using PCR) can be 
used to increase copy number of the target sequence prior to hybridization. 

Assays for detecting 20q13 amplicon proteins. 

As indicated above, this invention identifies protein products of genes in the 20q13 amplicon. 

Methods for detecting these proteins may include analytic biochemical methods such as electrophoresis, 
capillary electrophoresis, high performance liquid chromatography (HPLC), thin layer chromatography (TLC), 
hyperdiffusion chromatography, and the like, or various immunological methods such as fluid or gel precipitin 
reactions, immunodiffusion (single or double), immunoelectrophoresis, radioimmunoassay (RIA), 
enzyme-linked immunosorbent assays (ELISAs), immunofluorescent assays, western blotting, and the like. 

In one preferred embodiment, the 20q13 amplicon proteins are detected in an electrophoretic protein 
separation such as a one-dimensional or two-dimensional electrophoresis, while in a most preferred 
embodiment, the 20q13 amplicon proteins are detected using an immunoassay. 

As used herein, an immunoassay is an assay that utilizes an antibody to specifically bind to the analyte (e.g., 
ZABC1 or 1b1 proteins). The immunoassay is thus characterized by detection of specific binding of a 20q13 
amplicon protein to an anti-20q13 amplicon antibody (e.g., anti-ZABC1 or anti-1b1) as opposed to the use of 
other physical or chemical properties to isolate, target, and quantify the analyte. 

The collection of biological sample and subsequent testing for 20q13 amplicon protein(s) is discussed in 
more detail below. 

A) Sample Collection and Processing 

The 20q13 amplicon proteins are preferably quantified in a biological sample derived from a mammal, more 
preferably from a human patient or from a porcine, murine, feline, canine, or bovine. As used herein, a 
biological sample is a sample of biological tissue or fluid that contains a 20q13 amplicon protein 
concentration that may be correlated with a 20q13 amplification. Particularly preferred biological samples 
include, but are not limited to biological fluids such as blood or urine, or tissue samples including, but not 
limited to tissue biopsy (e.g., needle biopsy) samples. 

The biological sample may be pretreated as necessary by dilution in an appropriate buffer solution or 
concentrated, if desired. Any of a number of standard aqueous buffer solutions, employing one of a variety of 
buffers, such as phosphate, Tris, or the like, at physiological pH can be used. 

B) Electrophoretic Assays. 

As indicated above, the presence or absence of 20q13 amplicon proteins in a biological tissue may be 
determined using electrophoretic methods. Means of detecting proteins using electrophoretic techniques are 
well known to those of skill in the art (see generally, R. Scopes (1982) Protein Purification, Springer-Verlag, 
N.Y.; Deutscher, (1990) Methods in Enzymology Vol. 182: Guide to Protein Purification, 
Academic Press, Inc., N.Y.). In a preferred embodiment, the 20q13 amplicon proteins are detected using 
one-dimensional or two-dimensional electrophoresis. A particularly preferred two-dimensional electrophoresis 
separation relies on isoelectric focusing (IEF) in immobilized pH gradients for one dimension and 
polyacrylamide gels for the second dimension. Such assays are described in the cited references and by 
Patton et al. (1990) Biotechniques 8: 518. 

C) Immunological Binding Assays. 

In a preferred embodiment, the 20q13 amplicon proteins are detected and/or quantified using any of a number 
of well recognized immunological binding assays (see, e.g., U.S. Patents 4,366,241; 4,376,110; 4,517,288; and 
4,837,168). For a review of the general immunoassays, see also Methods in Cell Biology Volume 37: 
Antibodies in Cell Biology, Asai, ed. Academic Press, Inc. New York (1993); Basic and Clinical Immunology 
7th Edition, Stites & Terr, eds. (1991). 

Immunological binding assays (or immunoassays) typically utilize a "capture agent" to specifically bind to and 
often immobilize the analyte (in this case 20q13 amplicon protein). The capture agent is a moiety that 
specifically binds to the analyte. 

In a preferred embodiment, the capture agent is an antibody that specifically binds 20q13 amplicon protein(s). 

The antibody (e.g., anti-ZABC1 or anti-1b1) may be produced by any of a number of means well known to 
those of skill in the art (see, e.g. Methods in Cell Biology Volume 37: Antibodies in Cell Biology, Asai, ed. 
Academic Press, Inc. New York (1993); and Basic and Clinical Immunology 7th Edition, Stites & Terr, eds. 
(1991)). The antibody may be a whole antibody or an antibody fragment. It may be polyclonal or monoclonal, 
and it may be produced by challenging an organism (e.g. mouse, rat, rabbit, etc.) with a 20q13 amplicon 
protein or an epitope derived therefrom. Alternatively, the antibody may be produced de novo using 
recombinant DNA methodology. The antibody can also be selected from a phage display library screened 
against 20q13 amplicon (see, e.g. Vaughan et al. (1996) Nature 
Biotechnology, 14: 309-314 and references therein). 

Immunoassays also often utilize a labeling agent to specifically bind to and label the binding complex formed 
by the capture agent and the analyte. The labeling agent may itself be one of the moieties comprising the 
antibody/analyte complex. Thus, the labeling agent may be a labeled 20q13 amplicon protein or a labeled 
anti-20q13 amplicon antibody. Alternatively, the labeling agent may be a third moiety, such as another 
antibody, that specifically binds to the antibody/20q13 amplicon protein complex. In a preferred embodiment, 
the labeling agent is a second human 20q13 amplicon protein antibody bearing a label. Alternatively, the 
second 20q13 amplicon protein antibody may lack a label, but it may, in turn, be bound by a labeled third 
antibody specific to antibodies of the species from which the second antibody is derived. The second 
antibody can be modified with a detectable moiety, such as biotin, to which a third labeled molecule can 
specifically bind, such as enzyme-labeled streptavidin. 

Other proteins capable of specifically binding immunoglobulin constant regions, such as protein A or protein 
G may also be used as the label agent. These proteins are normal constituents of the cell walls of 
streptococcal bacteria. They exhibit a strong non-immunogenic reactivity with immunoglobulin constant 
regions from a variety of species. See, generally Kronval, et al., J. Immunol., 111:1401-1408 (1973), and 
Akerstrom, et al., J. Immunol., 135:2589-2542 (1985). 

Throughout the assays, incubation and/or washing steps may be required after each combination of 
reagents. Incubation steps can vary from about 5 seconds to several hours, preferably from about 5 minutes 
to about 24 hours. However, the incubation time will depend upon the assay format, analyte, volume of 
solution, concentrations, and the like. Usually, the assays will be carried out at ambient temperature, 
although they can be conducted over a range of temperatures, such as 10°C to 40°C. 

1. Non-Competitive Assay Formats 

Immunoassays for detecting 20q13 amplicon proteins may be either competitive or noncompetitive. 
Noncompetitive immunoassays are assays in which the amount of captured analyte (in this case 20q13 
amplicon protein) is directly measured. In one preferred "sandwich" assay, for example, the capture agent 
(anti-20q13 amplicon protein antibodies) can be bound directly to a solid substrate where they are 
immobilized. These immobilized antibodies then capture 20q13 amplicon protein present in the test sample. 
The 20q13 amplicon protein thus immobilized is then bound by a labeling agent, such as a second human 
20q13 amplicon protein antibody bearing a label. Alternatively, the second 20q13 amplicon protein antibody 
may lack a label, but it may, in turn, be bound by a labeled third antibody specific to antibodies of the species 
from which the second antibody is derived. The second antibody can be modified with a detectable moiety, 
such as biotin, to which a third labeled molecule can specifically bind, such as enzyme-labeled streptavidin. 



2. Competitive Assay Formats 

In competitive assays, the amount of analyte (20q13 amplicon protein) present in the sample is measured 
indirectly by measuring the amount of an added (exogenous) analyte (20q13 amplicon proteins such as 
ZABC1 or 1b1 protein) displaced (or competed away) from a capture agent (e.g., anti-ZABC1 or anti-1b1 
antibody) by the analyte present in the sample. In one competitive assay, a known amount of, in this case, 
20q13 amplicon protein is added to the sample and the sample is then contacted with a capture agent, in this 
case an antibody that specifically binds 20q13 amplicon protein. The amount of 20q13 amplicon protein 
bound to the antibody is inversely proportional to the concentration of 20q13 amplicon protein present in the 
sample. 
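
The inverse relationship described above is typically read off a standard curve. The sketch below is illustrative only: the function name, the standard values, and the use of straight-line interpolation between adjacent standards are assumptions made for brevity, and a fitted curve (for example, a four-parameter logistic) would normally be preferred in practice.

```python
def concentration_from_signal(standards, signal):
    """Interpolate analyte concentration from a competitive-assay standard curve.

    standards: list of (known_concentration, bound_label_signal) pairs; the
               bound signal falls as the concentration of unlabeled analyte rises.
    signal:    bound-label signal measured for the unknown sample.
    """
    points = sorted(standards)  # order by concentration
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        # Find the pair of standards whose signals bracket the measurement
        if s_lo >= signal >= s_hi and s_lo != s_hi:
            frac = (s_lo - signal) / (s_lo - s_hi)
            return c_lo + frac * (c_hi - c_lo)
    raise ValueError("signal outside the range of the standard curve")

# Hypothetical standards: (ng/ml of unlabeled protein, counts of bound label)
curve = [(0.0, 1000.0), (10.0, 600.0), (50.0, 200.0)]
print(concentration_from_signal(curve, 800.0))
print(concentration_from_signal(curve, 400.0))
```

A lower bound-label signal maps to a higher analyte concentration, which is the behavior the paragraph describes.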

In a particularly preferred embodiment, the anti-20q13 protein antibody is immobilized on a solid substrate. 
The amount of 20q13 amplicon protein bound to the antibody may be determined either by measuring the 
amount of 20q13 amplicon protein present in a 20q13 amplicon protein/antibody complex, or alternatively by 
measuring the amount of remaining uncomplexed 20q13 amplicon protein. The amount of 20q13 amplicon 
protein may be detected by providing a labeled 20q13 amplicon protein. 

A hapten inhibition assay is another preferred competitive assay. In this assay a known analyte, in this case 
20q13 amplicon protein, is immobilized on a solid substrate. A known amount of anti-20q13 amplicon protein 
antibody is added to the sample, and the sample is then contacted with the immobilized 20q13 amplicon 
protein. In this case, the amount of anti-20q13 amplicon protein antibody bound to the immobilized 20q13 
amplicon protein is inversely proportional to the amount of 20q13 amplicon protein present in the sample. 
Again the amount of immobilized antibody may be detected by detecting either the immobilized fraction of 
antibody or the fraction of the antibody that remains in solution. Detection may be direct where the antibody 
is labeled or indirect by the subsequent addition of a labeled moiety that specifically binds to the antibody as 
described above. 

3. Other Assay Formats 

In a particularly preferred embodiment, Western blot (immunoblot) analysis is used to detect and quantify the 
presence of 20q13 amplicon protein in the sample. The technique generally comprises separating sample 
proteins by gel electrophoresis on the basis of molecular weight, transferring the separated proteins to a 
suitable solid support (such as a nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating 
the sample with the antibodies that specifically bind 20q13 amplicon protein. The anti-20q13 amplicon protein 
antibodies specifically bind to 20q13 amplicon protein on the solid support. These antibodies may be directly 
labeled or alternatively may be subsequently detected using labeled antibodies (e.g., labeled sheep 
anti-mouse antibodies) that specifically bind to the anti-20q13 amplicon protein antibodies. 

Other assay formats include liposome immunoassays (LIA), which use liposomes designed to bind specific 
molecules (e.g., antibodies) and release encapsulated reagents or markers. The released chemicals are 
then detected according to standard techniques (see, Monroe et al. (1986) Amer. Clin. Prod. Rev. 5:34-41). 

D) Reduction of Non-Specific Binding. 

One of skill in the art will appreciate that it is often desirable to reduce non-specific binding in immunoassays. 
Particularly, where the assay involves an antigen or antibody immobilized on a solid substrate it is desirable 
to minimize the amount of non-specific binding to the substrate. Means of reducing such non-specific binding 
are well known to those of skill in the art. Typically, this involves coating the substrate with a proteinaceous
composition. In particular, protein compositions such as bovine serum albumin (BSA), nonfat powdered milk,
and gelatin are widely used with powdered milk being most preferred. 

E) Labels. 

The particular label or detectable group used in the assay is not a critical aspect of the invention, so long as 
it does not significantly interfere with the specific binding of the antibody used in the assay. The detectable 
group can be any material having a detectable physical or chemical property. Such detectable labels have 
been well-developed in the field of immunoassays and, in general, most any label useful in such methods
can be applied to the present invention. Thus, a label is any composition detectable by spectroscopic,
photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the
present invention include magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein
isothiocyanate, Texas red, rhodamine, and the like), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes
(e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and
colorimetric labels such as colloidal gold or colored glass or plastic (e.g. polystyrene, polypropylene, 
latex, etc.) beads. 

The label may be coupled directly or indirectly to the desired component of the assay according to methods
well known in the art. As indicated above, a wide variety of labels may be used, with the choice of label
depending on sensitivity required, ease of conjugation with the compound, stability requirements,
available instrumentation, and disposal provisions.

Nonradioactive labels are often attached by indirect means. Generally, a ligand molecule (e.g., biotin) is
covalently bound to the molecule. The ligand then binds to an anti-ligand (e.g., streptavidin) molecule which
is either inherently detectable or covalently bound to a signal system, such as a detectable enzyme, a
fluorescent compound, or a chemiluminescent compound. A number of ligands and anti-ligands can be used.
Where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and cortisol, it can be used in
conjunction with the labeled, naturally occurring anti-ligands. Alternatively, any haptenic or antigenic
compound can be used in combination with an antibody.

The molecules can also be conjugated directly to signal generating compounds, e.g., by conjugation with an
enzyme or fluorophore. Enzymes of interest as labels will primarily be hydrolases, particularly phosphatases,
esterases and glycosidases, or oxidoreductases, particularly peroxidases. Fluorescent compounds include
fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, etc. Chemiluminescent
compounds include luciferin, and 2,3-dihydrophthalazinediones, e.g., luminol. For a review of various labeling
or signal producing systems which may be used, see U.S. Patent No. 4,391,904.

Means of detecting labels are well known to those of skill in the art. Thus, for example, where the label is a
radioactive label, means for detection include a scintillation counter or photographic film as in
autoradiography. Where the label is a fluorescent label, it may be detected by exciting the fluorochrome with
the appropriate wavelength of light and detecting the resulting fluorescence. The fluorescence may be
detected visually, by means of photographic film, or by the use of electronic detectors such as charge
coupled devices (CCDs) or photomultipliers and the like. Similarly, enzymatic labels may be detected by
providing the appropriate substrates for the enzyme and detecting the resulting reaction product. Finally,
simple colorimetric labels may be detected simply by observing the color associated with the label. Thus, in
various dipstick assays, conjugated gold often appears pink, while various conjugated beads appear the
color of the bead.

Some assay formats do not require the use of labeled components. For instance, agglutination assays can 
be used to detect the presence of the target antibodies. In this case, antigen-coated particles are 
agglutinated by samples comprising the target antibodies. In this format, none of the components need be
labeled and the presence of the target antibody is detected by simple visual inspection. 

G) Substrates 

As mentioned above, depending upon the assay, various components, including the antigen, target antibody,
or anti-human antibody, may be bound to a solid surface. Many methods for immobilizing biomolecules to a
variety of solid surfaces are known in the art. For instance, the solid surface may be a membrane (e.g.,
nitrocellulose), a microtiter dish (e.g., PVC, polypropylene, or polystyrene), a test tube (glass or plastic), a
dipstick (e.g., glass, PVC, polypropylene, polystyrene, latex, and the like), a microcentrifuge tube, or a glass
or plastic bead. The desired component may be covalently bound or noncovalently attached through
nonspecific bonding.

A wide variety of organic and inorganic polymers, both natural and synthetic, may be employed as the
material for the solid surface. Illustrative polymers include polyethylene, polypropylene,
poly(4-methylbutene), polystyrene, polymethacrylate, poly(ethylene terephthalate), rayon, nylon, poly(vinyl
butyrate), polyvinylidene difluoride (PVDF), silicones, polyformaldehyde, cellulose, cellulose acetate,
nitrocellulose, and the like. Other materials which may be employed include paper, glasses, ceramics,
metals, metalloids, semiconductive materials, cements or the like. In addition, substances that form gels,
such as proteins (e.g., gelatins), lipopolysaccharides, silicates, agarose and polyacrylamides, can be used.

Polymers which form several aqueous phases, such as dextrans, polyalkylene glycols or surfactants, such
as phospholipids, long chain (12-24 carbon atoms) alkyl ammonium salts and the like are also suitable.
Where the solid surface is porous, various pore sizes may be employed depending upon the nature of the 
system. 

In preparing the surface, a plurality of different materials may be employed, particularly as laminates, to 
obtain various properties. For example, protein coatings, such as gelatin can be used to avoid non-specific 
binding, simplify covalent conjugation, enhance signal detection or the like. 

If covalent bonding between a compound and the surface is desired, the surface will usually be
polyfunctional or be capable of being polyfunctionalized.

Functional groups which may be present on the surface and used for linking can include carboxylic acids,
aldehydes, amino groups, cyano groups, ethylenic groups, hydroxyl groups, mercapto groups and the like.
The manner of linking a wide variety of compounds to various surfaces is well known and is amply
illustrated in the literature. See, for example, Immobilized Enzymes, Ichiro Chibata, Halsted Press,
New York, 1978, and Cuatrecasas (1970) J. Biol. Chem. 245: 3059.

In addition to covalent bonding, various methods for noncovalently binding an assay component can be
used. Noncovalent binding is typically nonspecific absorption of a compound to the surface. Typically, the
surface is blocked with a second compound to prevent nonspecific binding of labeled assay components.
Alternatively, the surface is designed such that it nonspecifically binds one component but does not
significantly bind another. For example, a surface bearing a lectin such as Concanavalin A will bind a
carbohydrate containing compound but not a labeled protein that lacks glycosylation. Various solid surfaces
for use in noncovalent attachment of assay components are reviewed in U.S. Patent
Nos. 4,447,576 and 4,254,082.

Kits Containing 20q13 Amplicon Probes.

This invention also provides diagnostic kits for the detection of chromosomal abnormalities at 20q13. In a
preferred embodiment, the kits include one or more probes to the 20q13 amplicon and/or antibodies to a
20q13 amplicon (e.g., anti-ZABC1 or anti-1b1) described herein. The kits can additionally include blocking
probes and instructional materials describing how to use the kit contents in detecting 20q13 amplicons. The kits
may also include one or more of the following: various labels or labeling agents to facilitate the detection of 
the probes, reagents for the hybridization including buffers, a metaphase spread, bovine serum albumin 
(BSA) and other blocking agents, sampling devices including fine needles, swabs, aspirators and the like, 
positive and negative hybridization controls and so forth. 

Expression of cDNA clones

One may express the desired polypeptides encoded by the cDNA clones disclosed here, or by subcloning
cDNA portions of genomic sequences, in a recombinantly engineered cell such as bacteria, yeast, insect
(especially employing baculoviral vectors), or mammalian cell. It is expected that those of skill in the art are
knowledgeable in the numerous expression systems available for expression of the cDNAs. No attempt to 
describe in detail the various methods known for the expression of proteins in prokaryotes or eukaryotes will 
be made. 

In brief summary, the expression of natural or synthetic nucleic acids encoding polypeptides of the invention
will typically be achieved by operably linking the DNA or cDNA to a promoter (which is either constitutive or
inducible), followed by incorporation into an expression vector. The vectors can be suitable for replication
and integration in either prokaryotes or eukaryotes. Typical expression vectors contain transcription and
translation terminators, initiation sequences, and promoters useful for regulation of the expression of the
DNA encoding the polypeptides. To obtain high level expression of a cloned gene, it is desirable to construct
expression plasmids which contain, at the minimum, a strong promoter to direct transcription, a ribosome
binding site for translational initiation, and a transcription/translation terminator.

Examples of regulatory regions suitable for this purpose in E. coli are the promoter and operator region of the
E. coli tryptophan biosynthetic pathway as described by Yanofsky, C., 1984, J. Bacteriol., 158:1018-1024
and the leftward promoter of phage lambda (PL) as described by Herskowitz, I. and Hagen, D., 1980,
Ann. Rev. Genet., 14:399-445. The inclusion of selection markers in DNA vectors transformed in E. coli is
also useful. Examples of such markers include genes specifying resistance to ampicillin, tetracycline, or
chloramphenicol. Expression systems are available using E. coli, Bacillus sp. (Palva, I. et al., 1983, Gene
22:229-235; Mosbach, K. et al., Nature, 302:543-545) and Salmonella. E. coli systems are preferred.

The polypeptides produced by prokaryote cells may not necessarily fold properly. During purification from E.
coli, the expressed polypeptides may first be denatured and then renatured. This can be accomplished by
solubilizing the bacterially produced proteins in a chaotropic agent such as guanidine HCl and reducing all
the cysteine residues with a reducing agent such as beta-mercaptoethanol.

The polypeptides are then renatured, either by slow dialysis or by gel filtration. See U.S. Patent No.
4,511,503.

A variety of eukaryotic expression systems, such as yeast, insect cell lines and mammalian cells, are known
to those of skill in the art. As explained briefly below, the polypeptides may also be expressed in these
eukaryotic systems.

Synthesis of heterologous proteins in yeast is well known and described. Methods in Yeast Genetics,
Sherman, F., et al., Cold Spring Harbor Laboratory (1982) is a well recognized work describing the various
methods available to produce the polypeptides in yeast. A number of yeast expression plasmids like
YEp6, YEp13, YEp4 can be used as vectors. A gene of interest can be fused to any of the promoters in
various yeast vectors. The above-mentioned plasmids have been fully described in the literature (Botstein, et
al., 1979, Gene, 8:17-24; Broach, et al., 1979, Gene, 8:121-133).

Illustrative of cell cultures useful for the production of the polypeptides are cells of insect or mammalian 
origin. Mammalian cell systems often will be in the form of monolayers of cells although mammalian cell 
suspensions may also be used. 

Illustrative examples of mammalian cell lines include VERO and HeLa cells, Chinese hamster ovary (CHO) 
cell lines, WI38, BHK, Cos-7 or MDCK cell lines.

As indicated above, the vector, e.g., a plasmid, which is used to transform the host cell, preferably contains
DNA sequences to initiate transcription and sequences to control the translation of the antigen gene
sequence. These sequences are referred to as expression control sequences. When the host cell is of insect
or mammalian origin, illustrative expression control sequences are often obtained from the SV-40 promoter
(Science, 222:524-527, 1983), the CMV I.E. Promoter (Proc. Natl. Acad. Sci. 81:659-663, 1984) or the
metallothionein promoter (Nature 296:39-42, 1982). The cloning vector containing the expression control
sequences is cleaved using restriction enzymes and adjusted in size as necessary or desirable and ligated
with the desired DNA by means well known in the art.

As with yeast, when higher animal host cells are employed, polyadenylation or transcription terminator
sequences from known mammalian genes need to be incorporated into the vector. An example of a
terminator sequence is the polyadenylation sequence from the bovine growth hormone gene. Sequences for
accurate splicing of the transcript may also be included. An example of a splicing sequence is the VP1 intron
from SV40 (Sprague, J. et al., 1983, J. Virol. 45:773-781).

Additionally, gene sequences to control replication in the host cell may be incorporated into the vector such
as those found in bovine papilloma virus type-vectors. Saveria-Campo, M., 1985, "Bovine Papilloma virus
DNA a Eukaryotic Cloning Vector" in DNA Cloning Vol. II a Practical Approach Ed. D.M. Glover,
IRL Press, Arlington, Virginia pp. 213-238.

Therapeutic and other uses of cDNAs and their gene products 

The cDNA sequences and the polypeptide products of the invention can be used to modulate the activity of
the gene products of the endogenous genes corresponding to the cDNAs. By modulating activity of the gene
products, pathological conditions associated with their expression or lack of expression can be treated. Any
of a number of techniques well known to those of skill in the art can be used for this purpose.

The cDNAs of the invention are particularly useful for the treatment of various cancers such as cancers of the
breast, ovary, bladder, head and neck, and colon. Other diseases may also be treated with the sequences of 
the invention. For instance, as noted above, GCAP (SEQ. ID. No. 6) encodes a guanino cyclase activating 
protein which is involved in the biosynthesis of cyclic AMP. Mutations in genes involved in the biosynthesis of 
cyclic AMP are known to be associated with hereditary retinal degenerative diseases. These diseases are a 
group of inherited conditions in which progressive, bilateral degeneration of retinal structures leads to loss of 
retinal function. These diseases include age-related macular degeneration, a leading cause of visual 
impairment in the elderly; Leber's congenital amaurosis, which causes its victims to be born blind; and
retinitis pigmentosa ("RP"), one of the most common forms of inherited blindness. RP is the name given to
those inherited retinopathies which are characterized by loss of retinal photoreceptors (rods and cones), with
retinal electrical responses to light flashes (i.e., electroretinograms, or "ERGs") that are reduced in amplitude.

The mechanism of retinal photoreceptor loss or cell death in different retinal degenerations is not fully 
understood.

The polypeptides encoded by the cDNAs of the invention can be used as immunogens to raise antibodies,
either polyclonal or monoclonal. The antibodies can be used to detect the polypeptides for diagnostic
purposes, as therapeutic agents to inhibit the polypeptides, or as targeting moieties in immunotoxins. The 
production of monoclonal antibodies against a desired antigen is well known to those of skill in the art and is 
not reviewed in detail here. 

Those skilled in the art recognize that there are many methods for production and manipulation of various 
immunoglobulin molecules. As used herein, the terms "immunoglobulin" and "antibody" refer to a protein
consisting of one or more polypeptides substantially encoded by immunoglobulin genes. Immunoglobulins
may exist in a variety of forms besides antibodies, including for example, Fv, Fab, and F(ab)2, as well as in
single chains. To raise monoclonal antibodies, antibody-producing cells obtained from immunized animals
(e.g., mice) are immortalized and screened, or screened first for the production of the desired antibody and
then immortalized. For a discussion of general procedures of monoclonal antibody production see Harlow
and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, N.Y. (1988).

The antibodies raised by these techniques can be used in immunodiagnostic assays to detect or quantify the 
expression of gene products from the nucleic acids disclosed here. For instance, labeled monoclonal 
antibodies to polypeptides of the invention can be used to detect expression levels in a biological sample. 
For a review of the general procedures in diagnostic immunoassays, see
Basic and Clinical Immunology, 7th Edition, D. Stites and A. Terr, eds. (1991).

The polynucleotides of the invention are particularly useful for gene therapy techniques well known to those
skilled in the art. Gene therapy as used herein refers to the multitude of techniques by which gene 
expression may be altered in cells. Such methods include, for instance, introduction of DNA encoding 
ribozymes or antisense nucleic acids to inhibit expression as well as introduction of functional wild-type 
genes to replace mutant genes (e.g., using wild-type GCAP genes to treat retinal degeneration). A number of 
suitable viral vectors are known. Such vectors include retroviral vectors (see Miller, Curr. Top. Microbiol.
Immunol. 158: 1-24 (1992); Salmons and Gunzburg, Human Gene Therapy 4: 129-141 (1993); Miller et al.,
Methods in Enzymology 217: 581-599 (1994)) and adeno-associated vectors (reviewed in Carter, Curr. Opinion
Biotech. 3: 533-539 (1992); Muzyczka, Curr. Top. Microbiol. Immunol. 158: 97-129 (1992)). Other viral vectors
that may be used within the methods include adenoviral vectors, herpes viral vectors and Sindbis viral
vectors, as generally described in, e.g., Jolly, Cancer Gene Therapy 1: 51-64 (1994); Latchman, Molec.
Biotechnol. 2: 179-195 (1994); and Johanning et al., Nucl. Acids Res. 23: 1495-1501 (1995).

Delivery of nucleic acids linked to a heterologous promoter-enhancer element via liposomes is also known
(see, e.g., Brigham, et al. (1989) Am. J. Med. Sci., 298:278-281; Nabel, et al. (1990) Science, 249:1285-1288;
Hazinski, et al. (1991) Am. J. Resp. Cell Molec. Biol., 4:206-209; and Wang and Huang (1987) Proc. Natl.
Acad. Sci. (USA), 84:7851-7855), as is delivery coupled to ligand-specific, cation-based transport systems
(Wu and Wu (1988) J. Biol. Chem., 263:14621-14624). Naked DNA expression vectors have also been
described (Nabel et al. (1990), supra; Wolff et al. (1990) Science, 247:1465-1468).

The nucleic acids and encoded polypeptides of the invention can be used directly to inhibit the
endogenous genes or their gene products. For instance, inhibitory nucleic acids may be used to specifically
bind to a complementary nucleic acid sequence. By binding to the appropriate target sequence, an RNA-RNA,
a DNA-DNA, or RNA-DNA duplex is formed. These nucleic acids are often termed "antisense" because they
are usually complementary to the sense or coding strand of the gene, although approaches for use of
"sense" nucleic acids have also been developed. The term "inhibitory nucleic acids" as used herein, refers to
both "sense" and "antisense" nucleic acids. Inhibitory nucleic acid methods encompass a number of different
approaches to altering expression of specific genes that operate by different mechanisms.

In brief, inhibitory nucleic acid therapy approaches can be classified into those that target DNA sequences,
those that target RNA sequences (including pre-mRNA and mRNA), those that target proteins (sense strand
approaches), and those that cause cleavage or chemical modification of the target nucleic acids (ribozymes).

These different types of inhibitory nucleic acid technology are described, for instance, in Helene, C. and 
Toulme, J. (1990) Biochim. Biophys. Acta, 1049:99-125. 

Inhibitory nucleic acid complementary to regions of c-myc mRNA has been shown to inhibit c-myc protein
expression in a human promyelocytic leukemia cell line, HL60, which overexpresses the c-myc proto-oncogene.
See Wickstrom E. L., et al., (1988) PNAS (USA), 85:1028-1032 and Harel-Bellan, A., et al., (1988) J. Exp.
Med., 168:2309-2318.

The encoded polypeptides of the invention can also be used to design molecules (peptide or nonpeptidic) 
that inhibit the endogenous proteins by, for instance, inhibiting interaction between the protein and a second 
molecule specifically recognized by the protein. Methods for designing such molecules are well known to 
those skilled in the art.

For instance, polypeptides can be designed which have sequence identity with the encoded proteins or may 
comprise modifications (conservative or non-conservative) of the sequences. The modifications can be
selected, for example, to alter their in vivo stability. For instance, inclusion of one or more D-amino acids in
the peptide typically increases stability, particularly if the D-amino acid residues are substituted at one or 
both termini of the peptide sequence. 

The polypeptides can also be modified by linkage to other molecules. 

For example, different N- or C-terminal groups may be introduced to alter the molecule's physical and/or
chemical properties. Such alterations may be utilized to affect, for example, adhesion, stability,
bioavailability, localization or detection of the molecules. For diagnostic purposes, a wide variety of labels may
be linked to the terminus, which may provide, directly or indirectly, a detectable signal. Thus, the 
polypeptides may be modified in a variety of ways for a variety of end purposes while still retaining biological 
activity. 

EXAMPLES 

The following examples are offered to illustrate, but not to limit the present invention.
Example 1 

REGION 20q13 IN BREAST CANCER 
Patients and tumor material. 

Tumor samples were obtained from 152 women who underwent surgery for breast cancer between 1987 and
1992 at the Tampere University or City Hospitals.

One hundred and forty-two samples were from primary breast carcinomas and 11 from metastatic tumors.
Specimens from both the primary tumor and a local metastasis were available from one patient. Ten of the
primary tumors that were either in situ or mucinous carcinomas were excluded from the material, since the
specimens were considered inadequate for FISH studies. Of the remaining 132 primary tumors, 128 were
invasive ductal and 4 lobular carcinomas. The age of the patients ranged from 29 to 92 years (mean 61).
Clinical follow-up was available from 129 patients. Median follow-up period was 45 months (range 1.4-1.77
months).

Radiation therapy was given to 77 of the 129 patients (51 patients with positive and 26 with negative lymph 
nodes), and systemic adjuvant therapy to 36 patients (33 with endocrine and 3 with cytotoxic chemotherapy). 
Primary tumor size and axillary node involvement were determined according to the tumor-node-metastasis
(TNM) classification. The histopathological diagnosis was evaluated according to the World
Health Organization (11). The carcinomas were graded on the basis of the tubular arrangement of cancer
cells, nuclear atypia, and frequency of mitotic or hyperchromatic nuclear figures according to Bloom and
Richardson, Br. J. Cancer, 11: 359-377 (1957).

Surgical biopsy specimens were frozen at -70°C within 15 minutes of removal. Cryostat sections (5-6 µm)
were prepared for intraoperative histopathological diagnosis, and additional thin sections were cut for
immunohistochemical studies. One adjacent 200 µm thick section was cut for DNA flow cytometric and FISH
studies.

Cell preparation for FISH. 

After histological verification that the biopsy specimens contained a high proportion of tumor cells, nuclei
were isolated from 200 µm frozen sections according to a modified Vindelov procedure for DNA flow
cytometry, fixed and dropped on slides for FISH analysis as described by Hyytinen et al., Cytometry 16:
93-99 (1994). Foreskin fibroblasts were used as negative controls in amplification studies and were prepared by
harvesting cells at confluency to obtain G1 phase enriched interphase nuclei. All samples were fixed in
methanol-acetic acid (3:1).

Probes. 

Five probes mapping to the 20q13 region were used (see Stokke, et al., Genomics, 26: 134-137 (1995)).
The probes included P1 clones for melanocortin-3-receptor (probe MC3R, fractional length from p-arm
telomere (Flpter) 0.81) and phosphoenolpyruvate carboxykinase (PCK, Flpter 0.84), as well as the anonymous
cosmid clone RMC20C026 (Flpter 0.79). In addition, RMC20C001 (Flpter 0.825) and RMC20C030 (Flpter
0.85) were used. Probe RMC20C001 was previously shown to define the region of maximum amplification
(Tanner et al., Cancer Res., 54: 4257-4260 (1994)). One cosmid probe mapping to the proximal p-arm,
RMC20C038 (Flpter 0.237), was used as a chromosome-specific reference probe. Test probes were labeled
with biotin-14-dATP and the reference probe with digoxigenin-11-dUTP using nick translation (Kallioniemi et
al., Proc. Natl. Acad. Sci. USA, 89: 5321-5325 (1992)).

Fluorescence in situ hybridization. 

Two-color FISH was performed using biotin-labeled 20q13-specific probes and digoxigenin-labeled 20p
reference probe essentially as described (Id.).

Tumor samples were postfixed in 4% paraformaldehyde/phosphate-buffered saline for 5 min at 4°C prior to
hybridization, dehydrated in 70%, 85% and 100% ethanol, air dried, and incubated for 30 min at 80°C. Slides
were denatured in a 70% formamide/2x standard saline citrate solution at 72-74°C for 3 min, followed by a
proteinase K digestion (0.5 µg/ml). The hybridization mixture contained 18 ng of each of the labeled probes
and 10 µg human placental DNA. After hybridization, the probes were detected immunochemically with
avidin-FITC and anti-digoxigenin rhodamine. Slides were counterstained with 0.2 µM
4,6-diamidino-2-phenylindole (DAPI) in an antifade solution.

Fluorescence microscope and scoring of signals in interphase nuclei.

A Nikon fluorescence microscope equipped with double band-pass filters (Chroma Technology, Brattleboro,
Vermont, USA) and 63x objective (NA 1.3) was used for simultaneous visualization of FITC and Rhodamine
signals. At least 50 non-overlapping nuclei with intact morphology based on the DAPI counterstaining were 
scored to determine the number of test and reference probe hybridization signals. 

Leukocytes infiltrating the tumor were excluded from analysis. Control hybridizations to normal fibroblast 
interphase nuclei were done to ascertain that the probes recognized a single copy target and that the
hybridization efficiencies of the test and reference probes were similar. 

The scoring results were expressed both as the mean number of hybridization signals per cell and as mean 
level of amplification (= mean of number of signals relative to the number of reference probe signals). 
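
The two scoring summaries described above can be sketched in Python as follows. This is a minimal illustration, not the scoring procedure actually used in the study: the function name score_nuclei, the min_nuclei parameter, and the choice to compute the level of amplification as the ratio of the mean test-probe count to the mean reference-probe count are assumptions, and the example counts are hypothetical.

```python
from statistics import mean


def score_nuclei(test_counts, ref_counts, min_nuclei=50):
    """Summarize per-nucleus FISH signal counts.

    test_counts / ref_counts hold the number of signals counted in each
    scored nucleus for the test probe and the reference probe, in the same
    nucleus order. Returns (mean test signals per cell, mean level of
    amplification).
    """
    if len(test_counts) != len(ref_counts):
        raise ValueError("need one test and one reference count per nucleus")
    if len(test_counts) < min_nuclei:
        raise ValueError(f"at least {min_nuclei} nuclei should be scored")
    mean_test = mean(test_counts)
    mean_ref = mean(ref_counts)
    if mean_ref == 0:
        raise ValueError("reference probe gave no signals")
    # Assumption: level of amplification = mean test / mean reference signals.
    return mean_test, mean_test / mean_ref


# Hypothetical counts for 50 nuclei (illustrative only, not study data).
test = [6] * 25 + [4] * 25
ref = [2] * 50
mean_signals, amplification = score_nuclei(test, ref)
```

With these hypothetical counts the nuclei average 5 test signals against 2 reference signals, giving a 2.5-fold level of amplification.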

DNA flow cytometry and steroid receptor analyses. 

DNA flow cytometry was performed from frozen 200 µm sections as described by Kallioniemi, Cytometry
9:164-169 (1988). Analysis was carried out using an EPICS C flow cytometer (Coulter Electronics Inc.,
Hialeah, Florida, USA) and the Multicycle program (Phoenix Flow Systems, San Diego, California, USA).

DNA-index over 1.07 (in over 20% of cells) was used as a criterion for DNA aneuploidy. In DNA aneuploid
histograms, the S-phase was analyzed only from the aneuploid clone. Cell cycle evaluation was successful
in 86% (108/126) of the tumors.
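
The aneuploidy criterion stated above amounts to a two-part threshold test, sketched below. The function name is_aneuploid and its parameter names are hypothetical; only the cutoffs (DNA index over 1.07, in over 20% of cells) and the 108/126 figure come from the text.

```python
def is_aneuploid(dna_index, cell_fraction,
                 index_cutoff=1.07, fraction_cutoff=0.20):
    """Return True when a population with the given DNA index exceeds the
    index cutoff and makes up more than the cutoff fraction of cells."""
    return dna_index > index_cutoff and cell_fraction > fraction_cutoff


# Share of tumors with a successful cell cycle evaluation, as reported.
success_rate = round(100 * 108 / 126)
```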

Estrogen (ER) and progesterone (PR) receptors were detected immunohistochemically from cryostat
sections as previously described (17). The staining results were semiquantitatively evaluated and a
histoscore greater than or equal to 100 was considered positive for both ER and PR (17).

Statistical Methods. 

Contingency tables were analyzed with Chi square test for trend. Association between S-phase fraction
(continuous variable) and 20q13 amplification was analyzed with Kruskal-Wallis test. Analysis of disease-free
survival was performed using the BMDP1L program and Mantel-Cox test, and Cox's proportional hazards
model (BMDP2L program) was used in multivariate regression analysis (Dixon, BMDP Statistical Software.
London, Berkeley, Los Angeles: University of California Press (1981)).
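
As an illustration of a chi-square test for trend, the sketch below implements the Cochran-Armitage form with one degree of freedom and applies it to the histologic grade counts reported in Table 4. This is an assumption about the kind of trend test meant: the study used BMDP software, whose exact variant is not stated here, and the function name chi_square_trend and the category scores 0, 1, 2 are hypothetical choices.

```python
import math


def chi_square_trend(cases, totals, scores):
    """Cochran-Armitage chi-square test for trend (1 degree of freedom).

    cases[j]  : subjects with the feature in ordered category j
    totals[j] : all subjects in ordered category j
    scores[j] : numeric score assigned to category j
    Returns (chi-square statistic, p-value).
    """
    n = sum(totals)
    p_bar = sum(cases) / n
    # Score-weighted difference between observed and expected case counts.
    t = sum(s * (r - m * p_bar) for s, r, m in zip(scores, cases, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    var = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    chi2 = t * t / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Grade III tumors among all graded tumors, by amplification category
# (no, low level, high level), taken from Table 4.
grade_iii = [16, 11, 4]
all_graded = [72 + 16, 18 + 11, 5 + 4]
chi2, p_value = chi_square_trend(grade_iii, all_graded, scores=[0, 1, 2])
```

Under these assumptions the p-value comes out at roughly 0.01, the same order as the value reported for histologic grade.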

Amplification of 20q13 in primary breast carcinomas by fluorescence in situ hybridization.

The minimal region probe RMC20C001 was used in FISH analysis to assess the 20q13 amplification. FISH
was used to analyze both the total number of signals in individual tumor cells and to determine the mean
level of amplification (mean copy number with the RMC20C001 probe relative to a 20p-reference probe).

In addition, the distribution of the number of signals in the tumor nuclei was also assessed. Tumors were
classified into three categories: no, low and high level of amplification. Tumors classified as not amplified
showed less than 1.5-fold copy number of the RMC20C001 probe as compared to the p-arm control.
Those classified as having low-level amplification had 1.5-3-fold average level of amplification. Tumors
showing over 3-fold average level of amplification were classified as highly amplified.
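
The three categories defined above can be written as a small classifier. The function name classify_amplification and the category labels are hypothetical; the treatment of the boundary values (exactly 1.5-fold and exactly 3-fold counted as low level) follows the wording "1.5-3-fold" and "over 3-fold".

```python
def classify_amplification(level):
    """Map a mean level of amplification (test probe copy number relative
    to the 20p reference probe) to the category used in the text."""
    if level < 1.5:
        return "none"
    if level <= 3:
        return "low"
    return "high"
```

For example, a tumor with a 2.5-fold mean level of amplification falls in the low-level category.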

The highly amplified tumors often showed extensive intratumor heterogeneity with up to 40 signals in
individual tumor cells. In highly amplified tumors, the RMC20C001 probe signals were always arranged in
clusters by FISH, which indicates location of the amplified DNA sequences in close proximity to one another,
e.g., in a tandem array. Low level 20q13 amplification was found in 29 of the 132 primary tumors (22%),
whereas nine cases (6.8%) showed high level amplification. The overall prevalence of increased copy
number in 20q13 was thus 29% (38/132).
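
The prevalence figures follow directly from the counts given in the text, as the short check below shows. The variable names are illustrative only.

```python
n_primary = 132          # primary tumors analyzed
low, high = 29, 9        # tumors with low and high level amplification

low_pct = round(100 * low / n_primary)            # low level, percent
high_pct = round(100 * high / n_primary, 1)       # high level, percent
overall = low + high                              # any increased copy number
overall_pct = round(100 * overall / n_primary)    # overall prevalence, percent
not_amplified = n_primary - overall               # tumors without amplification
```

The remaining 94 tumors without amplification correspond to the 71% listed for all primary tumors in Table 4.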

Defining the minimal region of amplification.

The average copy number of four probes flanking RMC20C001 was determined in the nine highly amplified
tumors. The flanking probes tested were melanocortin-3-receptor (MC3R, Flpter 0.81),
phosphoenolpyruvate carboxykinase (PCK, 0.84), RMC20C026 (0.79) and RMC20C030 (0.85). The
amplicon size and location varied slightly from one tumor to another but RMC20C001 was the only probe
consistently highly amplified in all nine cases.

Association of 20q13 amplification with pathological and biological features.

The 20q13 amplification was significantly associated with high histologic grade of the tumors (p=0.01). This
correlation was seen both in moderately and highly amplified tumors (Table 4). Amplification of 20q13 was
also significantly associated with aneuploidy as determined by DNA flow cytometry (p=0.01, Table 4). The
mean cell proliferation activity, measured as the percentage of cells in the S-phase fraction, increased
(p=0.0085 by Kruskal-Wallis test) with the level of amplification in tumors with no, low and high levels of
amplification (Table 4). No association was found with the age of the patient, primary tumor size, axillary
nodal status or steroid hormone-receptor status (Table 4).

Table 4. Clinicopathological correlations of amplification at chromosomal region 20q13 in 132 primary breast
cancers.

Pathobiologic feature        20q13 amplification status                       p-value
                             NO             LOW LEVEL      HIGH LEVEL
                             Number of      Number of      Number of
                             patients (%)   patients (%)   patients (%)

All primary tumors           94 (71%)       29 (22%)       9 (6.8%)

Age of patients
  < 50 years                 17 (65%)       6 (23%)        3 (12%)
  > 50 years                 77 (73%)       23 (22%)       6 (5.7%)           .39

Tumor size
  < 2 cm                     33 (79%)       7 (17%)        2 (4.8%)
  ≥ 2 cm                     58 (67%)       22 (25%)       7 (8.0%)           .16

Nodal status
  Negative                   49 (67%)       19 (26%)       5 (6.8%)
  Positive                   41 (75%)       10 (18%)       4 (7.3%)           .41

Histologic grade
  I-II                       72 (76%)       18 (19%)       5 (5.3%)
  III                        16 (52%)       11 (35%)       4 (13%)            .01

Estrogen receptor status
  Negative                   30 (67%)       10 (22%)       5 (11%)
  Positive                   59 (72%)       19 (23%)       4 (4.9%)           .42

Progesterone receptor status
  Negative                   57 (69%)       20 (24%)       6 (7.2%)
  Positive                   32 (74%)       8 (19%)        3 (7.0%)           .53

DNA ploidy
  Diploid                    45 (82%)       8 (14.5%)      2 (3.6%)           .01
  Aneuploid                  44 (62%)       20 (28%)       7 (10%)

S-phase fraction (%)         mean ± SD      mean ± SD      mean ± SD
                             9.9 ± 7.2      12.6 ± 6.7     19.0 ± 10.5        .0085 (Kruskal-Wallis test)

Relationship between 20q13 amplification and disease-free survival. 

Disease-free survival of patients with high-level 20q13 amplification was significantly shorter than for patients
with no or only low-level amplification (p=0.04). Disease-free survival of patients with moderately amplified
tumors did not differ significantly from that of patients with no amplification. Among the
node-negative patients (n=79), high-level 20q13 amplification was a highly significant
prognostic factor for shorter disease-free survival (p=0.002), even in multivariate
Cox's regression analysis (p=0.026) after adjustment for tumor size, ER, PR, grade, ploidy and S-phase
fraction.

20q13 amplification in metastatic breast tumors

Two of 11 metastatic breast tumors had low-level and one had high-level 20q13 amplification. Thus, the overall
prevalence (27%) of increased 20q13 copy number in metastatic tumors was similar to that observed in the
primary tumors.

Both primary and metastatic tumor specimens were available from one of the patients. This 29-year-old
patient developed a pectoral muscle infiltrating metastasis eight months after total mastectomy. The patient
did not receive adjuvant or radiation therapy after mastectomy. The majority of tumor cells in the primary
tumor showed a low-level amplification, although individual tumor cells (less than 5% of total) contained 6-20
copies per cell by FISH. In contrast, all tumor cells from the metastasis showed high-level 20q13 amplification
(12-50 copies per cell). The absolute copy number of the reference probe remained the same, suggesting
that high-level amplification was not a result of an increased degree of aneuploidy.

Diagnostic and Prognostic Value of the 20q13 Amplification

The present findings suggest that the newly-discovered 20q13 amplification may be an important component
of the genetic progression pathway of certain breast carcinomas. Specifically, the foregoing experiments
establish that: 1) High-level 20q13 amplification, detected in 7% of the tumors, was significantly associated
with decreased disease-free survival in node-negative breast cancer patients, as well as with indirect
indicators of high malignant potential, such as high grade and S-phase fraction. 2) Low-level amplification,
which was much more common, was also associated with clinicopathological features of aggressive tumors,
but was not prognostically significant. 3) The level of amplification of RMC20C001 remains higher than
amplification of nearby candidate genes and loci, indicating that a novel oncogene is located in the vicinity
of RMC20C001.

High-level 20q13 amplification was defined by the presence of more than 3-fold higher copy number of the
20q13 probe relative to the p-arm control. The frequency of high-level 20q13 amplification is somewhat lower
than the amplification frequencies reported for some of the other breast cancer oncogenes, such as ERBB2
(17q12) and Cyclin-D (11q13) (Borg et al., Oncogene, 6: 137-143 (1991), Van de Vyver et al., Adv. Canc.
Res., 61: 25-56 (1993)). However, similar to what has been previously found with these other oncogenes
(Swab, et al., Genes Chrom. Canc., 1: 181-193 (1990), Borg et al., supra), high-level 20q13 amplification
was more common in tumors with high grade or high S-phase fraction and in cases with poor prognosis.
Although only a small number of node-negative patients was analyzed, our results suggest that 20q13
amplification might have an independent role as a prognostic indicator. Studies to address this question in
large patient materials are warranted. Moreover, based on these survival correlations, the currently unknown,
putative oncogene amplified in this locus may confer an aggressive phenotype. Thus, cloning of this gene is
an important goal. Based on the association of amplification with highly proliferative tumors, one could
hypothesize a role for this gene in the growth regulation of the cell.

The role of the low-level 20q13 amplification as a significant event in tumor progression appears less clear.
Low-level amplification was defined as 1.5-3-fold increased average copy number of the 20q13 probe
relative to the p-arm control. In addition, these tumors characteristically lacked individual tumor cells with
very high copy numbers, and showed a scattered, not clustered, appearance of the signals. Accurate
distinction between high- and low-level 20q13 amplification can only be reliably done by FISH, whereas
Southern and slot blot analyses are likely to be able to detect only high-level amplification, in which
substantial elevation of the average gene copy number takes place. This distinction is important because
only the highly amplified tumors were associated with adverse clinical outcome. Tumors with low-level 20q13
amplification appeared to have many clinicopathological features that were intermediate between those found
for tumors with no and those with high-level amplification. For example, the average tumor S-phase fraction
was lowest in the non-amplified tumors and highest in the highly amplified tumors. One possibility is that
low-level amplification precedes the development of high-level amplification. This has been shown to be the
case, e.g., in the development of drug resistance-gene amplification in vitro (Stark, Adv. Canc. Res., 61:
87-113 (1993)). Evidence supporting this hypothesis was found in one of our patients, whose local metastasis
contained a much higher level of 20q13 amplification than the primary tumor operated 8 months earlier.

Finally, our previous paper reported a 1.5 Mb critical region defined by the RMC20C001 probe and exclusion of
candidate genes in breast cancer cell lines and in a limited number of primary breast tumors. Results of the
present study confirm these findings by showing conclusively in a larger set of primary tumors that the critical
region of amplification is indeed defined by this probe.

The present data thus suggest that the high-level 20q13 amplification may be a significant step in the 
progression of certain breast tumors to a more malignant phenotype. The clinical and prognostic implications 
of 20q13 amplification are striking and location of the minimal region of amplification at 20q13 has now been 
defined. 

It is understood that the examples and embodiments described herein are for illustrative purposes only and
that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to
be included within the spirit and purview of this application and scope of the appended claims. All
publications, patents, and patent applications cited herein are hereby incorporated by reference for all
purposes.

Discussion of the Accompanying Sequence Listing

SEQ ID NOs: 1-10 and 12-13 provide nucleic acid sequences. In each case, the information is presented as
a DNA sequence. One of skill will readily understand that the sequence also describes the corresponding
RNA (i.e., by substitution of the T residues with U residues) and a variety of conservatively modified
variations thereof. The complementary sequence is fully described by comparison to the existing
sequence, i.e., the complementary sequence is obtained by using standard base pairing rules for DNA (e.g.,
A to T, C to G). In addition, the nucleic acid sequence provides the corresponding amino acid sequence by
translating the given DNA sequence using the genetic code.
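
The three derivations just described (RNA by T-to-U substitution, the complement by base pairing, and the amino acid sequence by translation) can be illustrated with a minimal sketch. The function names are hypothetical, the standard genetic code is assumed, translation is assumed to start at the first base of the input, and the short example string is the stretch beginning at the first ATG of SEQ. ID. No. 1 as printed in the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_rna(dna: str) -> str:
    """Corresponding RNA: every T residue replaced by U."""
    return dna.upper().replace("T", "U")


def complement(dna: str) -> str:
    """Base-by-base complement (A<->T, C<->G), in the same left-to-right order."""
    return dna.upper().translate(_COMPLEMENT)


def reverse_complement(dna: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(dna)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    fragment = "ATGTCCCACCAG"
    print(to_rna(fragment))
    print(reverse_complement(fragment))
    print(translate(fragment))
```

The sketch only handles the unambiguous bases A, C, G and T; ambiguity codes such as N, which appear in parts of the listing, would need separate handling.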

For SEQ ID NO: 11, the information is presented as a polypeptide sequence. One of skill will readily
understand that the sequence also describes all of the corresponding RNA and DNA sequences which
encode the polypeptide, by conversion of the amino acid sequence into the corresponding nucleotide
sequence using the genetic code, by alternately assigning each possible codon in each possible codon
position. Similarly, each nucleic acid sequence which is provided also inherently provides all of the nucleic
acids which encode the same protein, since one of skill simply translates a selected nucleic acid into a
protein and then uses the genetic code to reverse translate all possible nucleic acids from the amino acid
sequence.
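
Reverse translation as described above, assigning each possible codon at each codon position, can be sketched as follows. The helper names are hypothetical and the standard genetic code is assumed. The number of encoding sequences is the product of the per-residue codon counts, so full enumeration is practical only for short peptides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Invert the code: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences encoding the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= len(CODONS_FOR[residue])
    return total


def reverse_translate(peptide: str):
    """Yield every DNA sequence encoding the peptide, one codon choice per position."""
    choices = (CODONS_FOR[residue] for residue in peptide.upper())
    for codons in product(*choices):
        yield "".join(codons)


if __name__ == "__main__":
    peptide = "MSHQ"  # Met-Ser-His-Gln: 1 * 6 * 2 * 2 codon choices
    print(count_encodings(peptide))
    for dna in list(reverse_translate(peptide))[:3]:
        print(dna)
```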

The sequences also provide a variety of conservatively modified variations by substituting appropriate 
residues with the exemplar conservative amino acid substitutions provided, e.g., in the Definitions section 
above. 

SEQUENCE LISTING 
SEQ. ID. No. 1 3bf4 3000bp 

CCGCCGGCCGGGGCGGGTGGCTGCACTCAGCGCCGGAGCCGGGAGCTAGCGGCCGCCGCCATGTCCCACCAGACCGGCAT 
CCAAGCAAGTGAAGATGTTAAAGAGATCTTTGC CAGAGC CAGAAATGGAAAGTACAGACTTCTGAAAATATCTATTGAAA 
ATGAGCAACTTGTGATTGGATCATATAGTCAGCCTTCAGATTCCTGG 

TTGGAGGACAAACAACCATG CTATATATTATTC AGGTTAGATTCTC AGAATGC CCAGGGATATGAATGGATATTCATTGC 

ATGGTCTCCAGATCATTCTCATGTTCGTCAAAAAATGT^ 

CTGGCCACATTAAAGATGAAGTATTTGGMCAGTAAAGGAAGATGTAT 

CAATCTTCCCCTGCCCCACTGACTGCAGCTGAGGAAGAACTACGACAGATTAAAATCAATGAGGTACAGACTGACGTGGG 

TGTGGACACTAAGCATCAAACACTACMGGAGTAGCAmCCCAT^ 

ATAATAGACAGCTCAACTATGTGCAGTTGX3AAATAGATATAAA 

GAACTGAAAGATTTGCCAAAGAGGATTC C CAAGGATTCAGCTCGTTACCATTTCTTTCTGTATAAACATTCCCATGAAGG 
AGACTATTTAGAGTCCATAGTTTTTATTTATTCAATGCCTGGATACACATGCAGTATAAGAG^ 

GCTGCAAGAGCCGTCTGCTAGAAATTGTAGAAAGACAACTACAAATGGATGTAATTAGAAAGATCGAGATAGACAATGGG 

GATGAGTTGACTGCAGACTTC C TTTATGAAGAAGTAC ATC C C AAGC AG(^TGCACACAAGCAAAGTTTTGC AAAAC C AAA 

AGGTCCTGCAGGAAAAAGAGGAA1TCGAAGACTAATTAGGGGCCCAGCGGAMTCGAAGCTACTACTGATTAAAGTCATC 

ACATTAAACATTGTAATACTAGTTTTTTAAAAGTCCAGC 

AGTAGGGAAAAAAATTGTACTTTTTGGAAMTAGCACTTTTC 

CATGATTTCTATTTTTGCGTTAAAGCTAGAAAAGGGTTCAACATAATGTTTAA 

ATTCCACACTTCAMTACTTCTTAAMTTTTATACA 

ACTACTTTCTTGTGGGACAGAAAGACCTTAAAATATTCATATTACTTAATGAATATGTTAAGGAC 

TCTAAGCTGGAAAC TTAGTGTGC C TTGGAAAAGC C GCAAGTTGCTTATCTC GAGT AGCTGTGCTAGCTCTGTC AGAC TGT 

AGGATCATGTCTGCAACTTTTAGAAATAGTGCTTTATATTGCAGCAGTCTTTTATATTTGAC 1 1 1 1 1 1 1 1 AATAGCATTA 

AAATTGCAGATCAGCTCACTCTGAAACTTTAAGGGTACCAGATATTTTCTA 

TAC TG C AGGATTTCTGATGAC ATTGAAAG 

ACTT 

TTGATAAAAGTCAAGTGC GAACAGAAC CTC C C AAGGAAAAGAATTGC AAGGAAAATGAATTTAGC TGTGAGGTATGTGG G 
WGAWmAGAGTCGCTTTTGATGTTGAGATCCA 

CGGAAGAAGATTCAAGGAGCCTTGGTTTCTTAAAAATCACATGCGGACRCATAATGGCAAATCGGGGGCCAGAAGC 

TGC AGCAAGGCTTGGAGAGTAGTC CAGCAACGATCAAC GAGGTC GTC CAGGTGCAC GC GGC C GAGAGC ATC TC C TCTCC T 

TGCAAAATCTGCATGGTTTGTGGCTTCCTATTTCCAAATAAAGAAAGTCTAATTGAGCACCGCAAGGTGCACACCAAAAA 

AACTG CTTTCGGTACCAGCAGCGCGCAGACAGACTCTCCACAAGGAGGAATGC C GTCCTCGAGGGAGGACTTCCTGCAGT 

TGTTCAACTTGAGACCAAAATCTCACCCTGAAACGGGGAAG^ 

ACCTTCCAGGCTTGGCAKCTGGCTACCAAAGGAAWAGTTGCCATTTGCCAAGAAGTGAAGGM 

CAC C GAC AAC GAC GATTC GAGTTC CGAGAAGGAGCTTGGAGAAAC AAATAAGAACC ATTGTG C AGGC C TCTC G CAAGAGA 
AAGAGAAGTGCAAACACTCCCACGGCGAAGCGCCCTCCGTGGACGCGGATCCCAAGTTACCCAGTAGCAAGGAGAAGCCC 
ACTCACTGCTCCGAGTGCGGCAAAGCmCAGAACCTACCACCAGCTGGTCTTGCACTCCAGGGTCC 
SEQ. ID. No. 4 cc43 2605 bp 

CAAGCTCGAAATTAACCCTCACTAAAGGGAACAAAAGCTGGAGCTCCACCGCGGTGGCGGCCGCTCTAGAACTAGTGX^T 

CCCCCGGGCTGCAGGAAnCGGCACGAGCTGGGCTACTACGATGGCGATGAGTTTCGAGTGGCCGTGGCAGTAT 

CC^CCCTTCTTTACGTTACAACCGAATGTGGACACTCGGCAGAAGCAGCTGGCCGCCTGGTGCTCGCTGGTCCTGTCCTT 

CTGCCGCCTGCAC AAACAGTC CAGCATGACGGTGATGGAAGCTCAGGAGAGCCC GCTCTTCAACAACGTCAAGCTACAGC 

GAMGCTTCCTGTGGAGTCGATCCAGATTGTATTAGAGGAACTGAGGAAGAAAGGGAACCTCGAGTGGTTGGATM 

GAACAACTCCGTCTTrACCCTGTATGAACTGACTAATC 

CCACTCTACTGCGGGCTCTGCAGGCCCTACAGCAGGAGCACAAGGCCGAGATCATCACTGTCAGCGATGGCCGAGGCGTC 

AAGTTCTTC TAGCAGGGAC CTGTCTCC CTTTACTTCTTAC CTCC C AC CTTTC CAGGGC TTTC AAAAGGAGACAGACC CAG 

TGTCCCCCAAAGACTGGATCTGTGACTCWCCAGACTCAAAAGGACTCCAGTCCTGAAGGCTGG^ 

TCTCACACCCCATATGTCTGTCCCTTGGATAGG GTGAGGCTGAAGCACC^G^K^GAAAATATGTGCTTCTTCTCGC CCTA 

CCTCCTTTCCCATCCTAGACTGTCCTTGAGC^ 

TACACACATGCGCCTGCAGCACATGCTTCTGTCTCCT^ 

TGGTGCTGGATCCTTCCTAGGGGATGGGGGAAGCCCTGGCTGC^^ 

CGGGCCTGGCAGTGAGAGGTGTGGCCCCAfJACCGATTTATGATATTAAAATCTCAACTCCC 

CTGAGACTAGTTCTCTCTCTCTCGAGAACT AGTCTCGA G I 1 1 M I I I 1 1 1 1 1 1 1 1 1 1 1 1 I H 1 1 1 111 I H 1 1 IG 

CAGAGGGAAAAAAAAAAGACCATGAATCTTCCTC 

CTGAAAAAAGTTGACTGAAACTCCAAACCAACATGC 

AGATAAAGAACAGTCTCAAGI 1 1 1 1 GTACAGC CTACACATAGTACAAGGGTCC CCTATGATGATTCTTCTGTAGGAC GAA 

ATAATGTAATTTTTrCAGTTTCTGGTTTATAACTC 

GAGAATAAAGCACTCATATTTTTATAAATTATATGG^ 

TGGACTAAAGtCAATAATTATTTTATTCTCAATGTCTGTGCTAACCTCAATGACTTAGAATGCTT^ 

TATGCCTCAACGACACTGGCTnCTTTTAGCTCTTGAACAAGCCA 

AACTGCTTCCTGCCTCAGGACC^ 

TAGTTTTATCCAACACTTCAGATCXTGCCGTAAAAACTCTTCTTATAGAAGC 
CATACTCAC CAGCA C^CATGTAGACTAGATTAGAACCTCCTGTTTTTCTTTTTCATACTTTTCTCTATCAT^
CATTATAATATTTTTATTATGTGTGTGAATGTCTGCCCCMGTCAGTTTCCTCACTAAACTATAAACTCCGTAAAGCTGG 
GATCCTTCCAATTTTGATCACCACTTAGTACAGTAGGAACACAGTAAAGATTCAATTGGTATTTGTGGAATGA^ 
MTTGTTTTGCTAGTAAAGTCTGGGGGAACCCAGGTGAGAAGAGCCTAGAAAGCAGGTCGAATCCAAGGCTAGATAGACT 
TAGTGTTACTCAAGAAAGGGTAGCCTGAAAATAAAGGTTCAAATTATAGTCAAGAATAGTCAAGACATGGGCAAGACAAG 
AGTGCTGCTCGTGCC GAATTCGATATC AAGCTTATCGATACCGTCGACCTC GAGGGGGGGCCCGGTACCC AATTCGCC CT 
ATAGTGAGTCGTATTACAATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAAT 
SEQ. ID. No. 5 41.1 1288 bp 

GAGGGCAGCGAGAAGGAGAAACCCCAGCCCCTGGAGCCCACATCTGCTCTGAGCAATGGGTGCGCCCTCGCCAACCACGC 
CCCGGCCCTGCCATGCATCAACCCACTCAGCGCCCTGCAGTCCGTCCTGAACAATCACTTGGGCAAAGCCACGGAGCCCT 
TGCGCTCACCTTCCTGCTC CAGCCCAAGTTCAAGCAC AATTTC CATGTTCCAC AAGTCGAATCTCAATGTCATGGACAAC 
CCGGTCTTGAGTCCTGCCTCCACAAGGTCAGCCAGCGTGTCCAGGCGCTACCTGTTTGAGAACAGCGATCAGCCCATTGA 
CCTGACCAAGTCCAAAAGC AAGAAAGCC GAGTCCTCGCAAGCACAATC TTGTATGTC C C CACCTCAGAAGC ACGCTCTGT 

CTGACATCGCCGACATGGTCAAAGTCCTCCCCAMGCCACCACCCCAMGCCAGCCTCCTCCTCCAGGGTCCCCCCCATGMGCTGGAAATGGATGTCAGGCGCm 
CAACTGGAATC CTCAGCATCTTCTGATTCTAC AAGC C CACTTTGCCTCGAGCCTCTTCCAGAC ATCAGAGGGCAAATACC 

TGCTGTCTGATCTGGGCCCACMGAGCGTATGCAAATCTCTMGTTTACGGGACTCTCAATGACCACTATCAGTCACTGG 

CTGGCCAACGTCAAGTACCAGCTTAGGAAAACGGGCGGGACAAMTTTCTGAAAAACATGGACAAAGGCCACCCCATCTT 

TTATTGCAGTGACTGTGCCTCCCAGTTCAGAACCCCTTCTACCTACATCAGTCACTTAGAATCTCACCTGGGTTTCCAAA 

TGAAGGACATGACCCGCTTGTCAGTGGACCAGCAAAGCAAGGTGGAGCAAGAGATCTCCCGGGTATCGTCGGCTCAGAGG 

TCTC C AGAAACAATAGCTGCC GAAGA G GAC AC AGACTCTAAATTCAAGTGTAAGTTGTGC TGTC GGAC ATTTGTGAGC AA 

ACATGCGGTAAAACTCCACCTAAGCAAAACGCACAGCAAGTCACCCGAACACCATTCACAGTTTGTAACAGACGTGGATG 

AAGAATAGCTCTGCAGGACGAATGCCTTAGTTTCCACTTTCCAGCCTGGATCCCCTCACACTGAACCCTTCTTCGTTGCA 

CCATCCTGCTTCTGAC ATTGAACTCATTGAACTCCTC CTGACACCCTGGCTCTGAGAAGACTGC CAAAAAAAAAAAAAAA 

AAAAATTC 

SEQ. ID. No. 6 GCAP 2820 bp

ATCCTAAGACGCACAGCCTGGGAAGCCAGCACTGGGGAAGTTGGGTGCTGAGGGATGTGGGTCACTGGGGTGAAGGTGGAGC

TTTCAGGGTCTCCCGTCAATGCAGCTGAGTTTCTTTGGCAGGGAATTTACCAGCTGAAGAAAGCCTGCCGGCGAGAGCT 

AC AAACTGAGCAAGGC CAGCTGCTCACACC CGACCAGGTC GTGGACAGGATCTTCCTC CTGGTGGATGAGAATGGAGATG 

GTAAGAGGGGCAGAGATGGGGAGAGTGCTGTCCACTCTGCATCATCGCCACTTTCTGGCCGCACGTCCTTGGGCAAGGCC 

CTCCACCTTCCAACCCTGGGGTCCTCATCTGTGAGAAGGCTGTGGAGAAGATGTCATGAACTAACAAAGGGACTCATGAG 

CACGTGmGTAGGAGTGACTAAMGTCCTACAGG^GTTGCTGATGGAGGCCAGGCACGCAGAATAGAAAGAATAGGAAC 

TTTGGAGTCAGGCAGGGAGTGATATATTGAGCTTCTCGTCCTAGTCTCAATTTCCTCATCTGGAAAATGGGGATAATAAT 

AGTGGTTGAGAGGAATGMTAGGATAATGTGTTTMGAGCAGGCATAGGGTAGACCTCCATTCAGGCTGCTTGGGCTTTC 

CTCCCTGTAGCCCAAAGCCCAGCCTCAGGGCTATGTGGGGAGAGAGCTGGCTTGGAATACACACTTGAGCCCTCCAGCTC 

TCTCAGCTC CACC CAG WTTTC CGTGGTACCATGC GC AAAAGTAAAACTTCAATTCATCAGCAAAGAAAGC CCCTTAAAG 

GTGGCAGGAGACTCCTGGAGATTCAGACACCCTGACAAGCCGCAAGCTTGAGGTCTGAGACTGCAGGATCGTTGGCATAAG 

ACGTGTAGGCGCATCCTGGGAGCGAGGTCTCTCCTCCTGCCC CCAGAC CC AGGTCTC CC CTTCTTCTACATGACCACCTC 

TCCTCCCCCTTGCTCAGGCCAGCTGTCTCTGMCCGAGTTTTGTTGAAGGTGCCCGTCGGGACAAGTGGGTGATGAAGATGC 

TGCAGATGGACATGAATC CCAGCAGCTGGCTCGCTCAGCAGAGAC GGAAAAGTGCC ATGTTCTGAAGGAGTCTGGGGC CCC 

TCCACGACTCCAGGCTCACCCAGGTTTCCAGGGTAGTAGGAGGGTCCCCTGGCTCAGCCTGCTCATGCCCACTCTTCCCC 

TGGTGTTGAC TTCC TGGCACCCCCTGTGCAGGGCTGAGTGGGGATGGGGAAGGGCTGCTGGGTTTGAAGTGGCCAACAGG 

GCATAGTCCATmGGAGGAGTCCCTGGGATGGTGAAGGGAAnC^GmCTTrTCCTGTTCAGCCGCTCCTGGGAGGAC 

TGTGCCTTGGCTGGGTGGTTGTGGGGCTCCCACAGTTTCTGGGTGTTCTCAGTTGGAAGCAAGAAGCCAACTGAGGGGTGA 

GGGTCCCACAGACCAAATCAGAAATGAGAACACAAAGACTGGTAGGAGGCAGGGGTTGGGAGGGTGTTGAGACTGAAGAAA 

AGGCAGGAGTTGCCGGGCACGGTGG CTCACGCCTGTAATCCCAGCACTTTGGGAGGC CGAGGCGGGCAGATCACGAGGTC 

AGGAGATC GAGAC CATCCTGGCTAAC ACGGGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATCAGCCGGGTGAGGTG 
GCGGGCGCCTGTAGTCCCAGCTACTCAGGAGGCTGAGGCAAGAGAATGGCGTGAACCCCAGGGGGCCGAGCCTACAGTGA 
GC C GAGATTG C GC CACTGCACTCC AGCCTGGACGAC AGTGAGACTCC GTCTCAAAAAAAAAAAAAGAAAGAAAAGAAAAG 
GC GGAGTTTTGGG GGGC AGGGGGCAGCAATC CTTCTATAAGTTCC GGGATGCCTGAGGGGCGTTCATGGGGAGGAC C CTG 
GC CTCCTCCTCCCCAAGGCATC CTCACCAGTGGTGTCAAC AGGAAAAATGGCAGCAAATAC GCTGCAGGCTGT GGTCTTT 
CTGCCTTTGAAAGGGTCAGCTGTAC TTAAAGGGACTGTTTCAGCTC TGCCTGGGTGCTGCTCTGGGACCCCCTGCTGC CA 
ACCCACCACTCCCCCAACAATC CTCTCTTTCCATCCATATCCCCCGTATGGAC CTTTTC CACAACTCCCAGC CATAAGCTG 
AATGTTTCTCTTTAAAGGATGGAGAAAACTTCTGTCTGTCTCTGGCMGAATT 

TGGGCTTGGCTTCTMCTGCTGTGTGACCCAAGACAGCCACTTCTCCTCCCTMCCTTTGGTTATGTCTTGGCAGCACAGT 
GTGTAACCCCAATATAGAAACTAGATTAAAAGGGAGTCTCTCTGGTTGAAAGGGGAGCTGAGTACCCTCTGGAACTGGAG 
GC ACCTCTGAAAAAAGCAAACTGAAAAC CAGTGCC CTGGGTCACTGTTACTCCTATAAGACAGTTTAAAGTGAGACCTGG 
AAAAACATTTTGCTTTAC CTTGAATAGATAGGTTTTTATGTTGGTATATAAGAAATAAAACTAACCTATTAACC CTGAGAC 
TTTACAGGTGTGTTAGGGCATATGATAGTCATATAAAATTCCTTTAGACATCAATTTTAGGTAAAAAATAATT 

AAAAATATTGGCCAGGTGCAGCAGCTCACACCTGCAATCCCAGGACTTTGGGAGGCCGAGGCGGGTGGATCACCTGAGGT 
CAGG GGTTCAAGACCAGCCTG 
SEQ. ID. No. 7 1b4 1205 bp 

GC GCGCGTGAGTCCGCCCCC C CAGTC ACGTGACC GCTGACTCGGGGCGTTCTCCACTATCGCTTACCTAC CTGGGTCTGC 
AGGAACCCGGCGATATGGCTGCCGCTGTGCCCCGCGCCGCATTTCTCTCCCCGCTGCTTCCCTTCTCCTGGGCTTCCTGC 
TCCTCTC CGCTCC GCATGGCGGCAGCGGCCTGCACACCAAGGGGC GCCCTTTCCCCTGGATACGGTCACTTTCTACAAGGTCA 
TTC C C AAAAGC AAGTTTC GTCTGGTGAAGTTC GACAGGGAGTACC CC TAC GGTGAGAAG CAGGATGAGTTC AAGCGTCTTC 
TGAAAACTCGGCTTCCAGCGATGATCTCTTGGTGGCAGAGGTGGGGATCTCAGATTATGTGACAAGCTGAACATGGAGCT 
GAGTGAGAAATACAAGCTGGACAAAGAGAGCTAC CCATCTTCTACCTCTTCCGGGATGGGGACT7TGAGAACCC AGTC C C 
ATACACTGGGGCAGTTAGGTTGGAGCCATCCAGCGCTGGCTGAAGGGGCAAGGGGTCTACCTAGGTATGCCTGGTGCCTG 

CCTGTATACGACGAAATGGCCGGGGAGTTCATCAGGGCCTCTGGTGTGGAGGCCGCCAGGCCCTCTTGAAGCAGGGGCAA 
GGGGAGCACTTC CAGCATCAGAGATGACACGGATCGC CAGGCTGATTGAGAAGAAC AAGATGAGTGAC G GCAGAAGGAGG 
AGCTCCAGAAGAGCTTAAACATCCTGACTGCCTTCCAGAAGAAGGGGGCCGAGAAAGAGGAGCTGTAAAAGGCTGTCTG 

TGATTTTCCAGGGTTTGGTGGGGGTAGGGAGGGGANAGTTAACCTGCT 
<RT1 

MSKGGGAAAAAAGVVGGTACTMCCCACCCAGCAGTMCATCMTCCTTCCATC 

CTAGTCAAGAAC GATGGTAAAAATCAGGACACTGAAGATbCACTATrAACC GCTGACAGTGCGCAAACCAkAAATr 
TGA 

AAAGATTTTTTGATGGTGCCAAAGATGTTCAGGCAGTCCACCTGCAAAGCAGCTTAAGGAGATGCC 
TT 

TCAGAATGTTCTGGGWGCGCTGTCCTCTCACCAGCACACAAAGATACTCAGGGATTTCCATAA^ 
AT 

GAC AGTGCTGATAAAGTGAATAAAAAC C CTAC CC CTGCTTACCTGGAC CTGTTAAAAAAGAGATCAG CAGTTGA 
AA 

GAGGCTATGACC TTCCCV^GTACCATATGGTCAGAGGC ATCACATCACTGTTAC CGCAGGACTGTGTGTATCC 
GTC 

CCTAACAAAGCTGAAACAAAAAAGGACCCAGAAGACACGGGTGCTGAAAAGTC^ 
TTA 

AGTCAGAC AAAGC C AAC TTTACATCC C AGGAGACC CAAGGGGCTGGCAAGAATTCCAAAGGATGCAAC C CATC 
GGG ojusfy^seq 

TGTGATATTGATTCATGCCCTCTTGCACCTTGCCAAACATCACACGCTTG 
C CATC CAGTCC ACTC GATTTTGGC AGTGCAGATGAAAAACTGGGAACCAT 
TTGTGTTGAGTCCAGCAAGATGCCAGGACCTGCATGTTTCAGAACGAAGT 
TCTTCATCATCCAATTTCTCCCTGTATATGGGCTTACCACNACTGCCGTT

AAGTC GTGTNAAGTCAC C ACTC AGGTACATAATG GAATAATTCTGC AAA G 

GCAGGAGNCACTTTCTCTCCAGTGCTCAGACCATGAAAGTTTTCTGATGT 

CTTTGGAACTTTGTCTGCAAATAGCTCGAAGGAGACATGGCCTAAAGGCT 

CGCCATCTGCGGTGATATTGNAACATGGTAGGGCTGACCGTGGCTGTGGC 

CATGACT 

Query: 

Sbjct: 

EMI72.1 

Score = 177 (48.9 bits), Expect = 1.5e-58, Sum P(5) = 1.5e-58

Identities = 41/48 (85%), Positives = 41/48 (85%), Strand = Minus / Plus

Query:

Sbjct:

EMI72.2

Score = 154 (42.6 bits), Expect = 1.5e-58, Sum P(5) = 1.5e-58

Identities = 34/38 (89%), Positives = 34/38 (89%), Strand = Minus / Plus

Query:

Sbjct:

EMI72.3

Score = 86 (23.8 bits), Expect = 1.5e-58, Sum P(5) = 1.5e-58

Identities = 22/28 (78%), Positives = 22/28 (78%), Strand = Minus / Plus

Query: 

Sbjct: 

EMI72.4 
Claims of corresponding document: WO9802539

WHAT IS CLAIMED IS:

1. An isolated nucleic acid molecule comprising a polynucleotide sequence having a subsequence which
specifically hybridizes under stringent conditions to a sequence selected from the group consisting of
SEQ. ID. No. 2, SEQ. ID. No. 3, SEQ. ID. No. 4, SEQ. ID. No. 5, SEQ. ID. No. 6, SEQ. ID. No. 7,
SEQ. ID. No. 8, SEQ. ID. No. 9, SEQ. ID. No. 10, SEQ. ID. No. 12, and SEQ. ID. No. 13.

2. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 2.

3. The isolated nucleic acid of claim 2, wherein the subsequence is SEQ. ID. No. 2.

4. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes to SEQ. ID. No. 3.

5. The isolated nucleic acid of claim 4, wherein the polynucleotide is SEQ. ID. No. 3.

6. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 4.

7. The isolated nucleic acid of claim 6, wherein the subsequence is SEQ. ID. No. 4.

8. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 5.

9. The isolated nucleic acid of claim 8, wherein the subsequence is SEQ. ID. No. 5.

10. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 6.

11. The isolated nucleic acid of claim 10, wherein the subsequence is SEQ. ID. No. 6.

12. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 7.

13. The isolated nucleic acid of claim 12, wherein the subsequence is SEQ. ID. No. 7.

14. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 8.

15. The isolated nucleic acid of claim 14, wherein the subsequence is SEQ. ID. No. 8.

16. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 9.

17. The isolated nucleic acid of claim 16, wherein the subsequence is SEQ. ID. No. 9.

18. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 10.

19. The isolated nucleic acid of claim 18, wherein the subsequence is SEQ. ID. No. 10.

20. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 12.

21. The isolated nucleic acid of claim 20, wherein the subsequence is SEQ. ID. No. 12.

22. The isolated nucleic acid of claim 1, wherein the subsequence specifically hybridizes under stringent
conditions to SEQ. ID. No. 13.

23. The isolated nucleic acid of claim 22, wherein the subsequence is SEQ. ID. No. 12.

24. The isolated nucleic acid of claim 1, further comprising a promoter sequence operably linked to the
polynucleotide sequence.

25. The isolated nucleic acid of claim 1, which nucleic acid is a cDNA molecule.

26. A method of screening for neoplastic cells in a sample, the method comprising:

contacting a nucleic acid sample from a human patient with a probe which hybridizes selectively to a target
polynucleotide sequence comprising a sequence selected from the group consisting of SEQ. ID. No. 1,
SEQ. ID. No. 2, SEQ. ID. No. 3, SEQ. ID. No. 4, SEQ. ID. No. 5, SEQ. ID. No. 6, SEQ. ID. No. 7,
SEQ. ID. No. 8, SEQ. ID. No. 9, SEQ. ID. No. 10, SEQ. ID. No. 11, SEQ. ID. No. 12, and SEQ. ID. No. 13,
wherein the probe is contacted with the sample under conditions in which the probe hybridizes selectively
with the target polynucleotide sequence to form a stable hybridization complex; and

detecting the formation of a hybridization complex.

27. The method of claim 26, wherein the nucleic acid sample is from a patient with breast cancer.

28. The method of claim 26, wherein the nucleic acid sample is a metaphase spread or an interphase
nucleus.

29. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 1.

30. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 2.

31. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 3.

32. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 4.

33. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 5.

34. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 6.

35. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 7.

36. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 8.

37. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 9.

38. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 10.

39. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 12.

40. The method of claim 26, wherein the probe comprises a polynucleotide sequence as set forth in SEQ.
ID. No. 13.

41. The method of claim 26, wherein the probe is used to identify the presence of a mutation in the target
polynucleotide sequence.

42. A method for detecting a neoplastic cell in a biological sample, the method comprising:

contacting the sample with an antibody that specifically binds a polypeptide antigen encoded by a
polynucleotide sequence comprising a sequence selected from the group consisting of SEQ. ID. No. 1,
SEQ. ID. No. 2, SEQ. ID. No. 3, SEQ. ID. No. 4, SEQ. ID. No. 5, SEQ. ID. No. 6, SEQ. ID. No. 7,
SEQ. ID. No. 8, SEQ. ID. No. 9, SEQ. ID. No. 10, SEQ. ID. No. 12, and SEQ. ID. No. 13; and

detecting the formation of an antigen-antibody complex.

43. The method of claim 42, wherein the sample is from breast tissue.

44. A method of inhibiting the pathological proliferation of cancer cells, the method comprising inhibiting the
activity of a gene product of an endogenous gene having a subsequence which hybridizes under stringent
conditions to a sequence selected from the group consisting of SEQ. ID. No. 1, SEQ. ID. No. 2,
SEQ. ID. No. 3, SEQ. ID. No. 4, SEQ. ID. No. 5, SEQ. ID. No. 6, SEQ. ID. No. 7, SEQ. ID. No. 8,
SEQ. ID. No. 9, SEQ. ID. No. 10, SEQ. ID. No. 12, and SEQ. ID. No. 13.

45. A method of detecting a cancer, said method comprising detecting the overexpression of a protein
encoded in a 20q13 amplicon.

46. The method of claim 45, wherein said protein encoded in a 20q13 amplicon is ZABC1.

47. The method of claim 45, wherein said protein encoded in a 20q13 amplicon is 1b1.

Data supplied from the esp@cenet database - Worldwide 






PCT 



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6:

C12N 15/11, C12Q 1/68, A61K 48/00

A1

(11) International Publication Number: WO 98/02539

(43) International Publication Date: 22 January 1998 (22.01.98)

(21) International Application Number: PCT/US97/12343

(22) International Filing Date: 15 July 1997 (15.07.97)

(30) Priority Data:
08/680,395    15 July 1996 (15.07.96)       US
08/731,499    16 October 1996 (16.10.96)    US
08/785,532    17 January 1997 (17.01.97)    US



(71) Applicant (for all designated States except US): THE
REGENTS OF THE UNIVERSITY OF CALIFORNIA
[US/US]; 22nd floor, 300 Lakeside Drive, Oakland, CA
94612 (US).

(72) Inventors; and

(75) Inventors/Applicants (for US only): GRAY, Joe, W. [US/US];
1921 11th Avenue, San Francisco, CA 94116 (US).
COLLINS, Colin, Conrad [US/US]; 333 Mountain View
Avenue, San Rafael, CA 94901 (US). HWANG, Soo-In
[US/US]; 189 Fairlawn Drive, Berkeley, CA 94708 (US).
GODFREY, Tony [GB/US]; 699 Teresita Boulevard, San
Francisco, CA 94127 (US). KOWBEL, David [US/US];
6009 Auburn Avenue, Oakland, CA 94618 (US). ROMMENS,
Johanna [CA/CA]; 717 Bay Street #1501, Toronto,
Ontario M5G 2J9 (CA).



(74) Agents: BASTIAN, Kevin, L. et al.; Townsend and Townsend 
and Crew LLP, 8th floor, Two Embarcadero Center, San 
Francisco, CA 94111 (US). 



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
GH, HU, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR,
LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ,
PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR,
TT, UA, UG, US, UZ, VN, YU, ZW, ARIPO patent (GH,
KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ,
BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE,
CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL,
PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN,
ML, MR, NE, SN, TD, TG).



Published 

With international search report 



(54) Title: GENES FROM 20q13 AMPLICON AND THEIR USES
(57) Abstract 

The present invention relates to cDNA sequences from a region of amplification on chromosome 20 associated with disease. The 
sequences can be used in hybridization methods for the identification of chromosomal abnormalities associated with various diseases. The 
sequences can also be used for treatment of diseases. 



